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METHODS OF SEQUENTIAL ESTIMATION FOR DETERMINING INITIAL DATA 
IN NUMERICAL WEATHER PREDICTION 
Stephen E. Cohn 

Advisors: Michael Ghll and Eugene Isaacson 

ABSTRACT 


Numerical weather prediction (NWP) is an lnltlal'*value problem for 
a systan of nonlinear partial differential equations, in which initial 
values are known incompletely and inaccurately. Observational data 
available at the initial time must therefore be supplemented by data 
available prior to the initial time, a problem known as meteorological 
data assimilation. 

A further complication in NWP is that solutions of the governing 
equations evolve on two different time scales, a fast one and a slow 
one, whereas fast scale motions in the atmosphere are not reliably 
observed. This leads to the so-called Initialization problem: initial 

values mist be constrained to result in a slowly evolving forecast. 

The theory of estimation of stochastic-dynamic systems provides a 
natural approach to such problems. For linear stochastic-dynamic 
models, the Kalraan-Bucy (KB) sequential filter is the optimal data 
assimilation method. We show that, for linear models, the optimal 
combined data assimilation-initialization method is a modified version 
of the KB filter. This modified KB filter combines the standard KB 
filter with a projection onto the slow solution subspace. 

The shallow-water equations are a simple system whose solutions 
exhibit many features of large-scale atmospheric flow important in NWP. 
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We Inplaraent the standard and modified KB filters for a llnearizied 
version of these equations , given a simple observational pattern* The 
numerical results s h ow tlvit the modified filter produces a slowly 
evolving forecast, at the expense of forecast errors only slightly 

larger than those incurred by using the standard KB filter* 

A statistical data assimilation method widely used at NWP centers 
Is known as optimal interpolation (01)* We Implement 01 for the 
shallow-water model, and we use the estimation-theoretic framework to 
compare the performance of 01 with that of the standard and modified KB 
filters. 

Numerical results show tlrat the simplifying assumptions involved 
In 01 lead to relatively large errors near boundaries separating 

data-dense and data-sparse regions, and that proper initialization is a 
partial cure for this boundary effect. We show also how estimation 
theory can be used to tune the free parameters involved in 01, in such 

a way that the tuned scheme performs roughly as well as the modified KB 


filter. 
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CHAPtER ONE 
INTRODUCTION 

1* 1 • Description of the Problem and Outline of Rewults 

One of the nsln reasons vhy we cannot tell what tha weather will 
be tomorrow Is that we do not know what the vieather la today* In other 
words, numerical weather prediction (NWP) Is an Inltlal-^alue problem 
for which Initial data are not available In sufficient quantity or with 
sufficient accuracy* 

Numerical forecasts are now produced routinely by weather services 
In many countries* The models used In NWP are discretized versions of 
the partial differential equations which govern large-scale atmospheric 
flow. The spatial domain of many models surrounds either the entire 
globe or at least an entire hemisphere* Values of the atmospheric 
variables must be specified over a regular three-dimensional mesh at 
each Initial forecast time. 

Observational data, collected from a variety of ground-based, 
airborne and space-borne observing systems, are distributed very 
Irregularly In space and time. At any single time, the data are too 
sparse over most of the globe to determine a complete set of Initial 
values. Observational data are also subject to significant random, 
systematic and correlated errors. 

Data available at the Initial time of each forecast must therefore 
be supplemented with Information from previous observations* The 
attempt to provide Initial values for NWP models by use of all 
available data Is known as four -d Ime ns Iona 1 data assimilation* The 
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adjActlve ”four~dl<aenilonal'' •mphiSiiaf that data diatributad In both 
time and apace must be used* 

The current data assimilation practice at NWP centers la to 
linearly combine observations available at the Initial time with a 
forecast Issued from previous observations. A new forecast Is then 
issuedi and the process Is repeated as further observations become 
available: the forecast model assimilates the data. The problem of 
four-dlmenslonal data assimilation Is essentially that of determining 
the best way to combine forecast and observed values. 

The peculiar dynamics of the earth's atmosphere present a further 
difficulty In the determination of Initial values. Namely* hWP Is a 
problem with two time scales . In which motion on the fast time scale Is 
not reliably observed. 

The system of nonlinear partial differential equations which 
governs the atmosphere's dynamics admits two types of solutions: 
rapidly evolving solutions* which In the earth's atmosphere consist 
mostly of Inert la -gravity waves , and slowly evolving solutions* 
consisting mostly of Rossby waves . In the earth's atmosphere* the 
fast-scale motions occur mainly on small spatial scales Which are 
resolved neither by the observational network nor by global NWP models. 
Fortunately, the fast-scale components of motion typically carry much 
less energy than the slow-scale components. On spatial scales which 
NWP models are designed to resolve* the slow motions are the 
significant ones. 

Initial data for NWP models must be chosen accordingly: the data 
tmist be constrained to result In a slowly evolving forecast. 
Improperly chosen Initial data lead to spurious fast waves which appear 


•• large ) tranalant error* in the forecaat of every ■eteorologlcel 
variable. Foracaete of vertical not ion » and therefore precipitation 
forecaatSi are particularly affected by the epurioue wave*. 

The process of adjusting initial data so that a slowly evolving 
forecast ensues is known as initialization . Customarilyi a data 
assimilation method provides "first grass" initial datai which are then 
adjusted by application of an initialization scheme. 

The estimation theory of stochastic-dynamic systems provides a 
natural framework for studying the data assimilation-initialization 
problem. In the approach of estlmatiovi theory, the evolution of the 
atmosphere is assumed to differ from that of a given NWP model by 
random increments; the random increments ate meant to account for 
modeling errors. Thus, the "true" atmospheric state is governed by a 
stochastic-dynamic model. Observations are treated similarly! they are 

noisy "output" of the stochastic-dynamic atmospheric model. 

♦ 

In the context of estimation theory, the data assimilation problem 
Is that of estimating the "true" state, given the unperturbed, 
Imperfect NWP model, and given Inaccurate, Incomplete observations of 
the "true" state. The initialization problem Is that of eonstrainlng 
the state estimates, or forecast, to evolve slowly. 

In this dissertation, we apply estimation theory to study the data 
assimilation-initialization problem, In two ways. First, we formulate 
a combined data assimilation-initialization method which is 
statistically optimal for linear stochas tic-dynamic models. Second, by 
means of numerical experiments with a simple linear model, we compare 
this method with the method of "optimal Interpolation" which Is widely 
used at NWP centers. 


For llnoar •tochastlc'^ynanic nodala, tha dlacrata KalMan-Bucy 
(KB) filtar of aatinatlon theory ia the •tatiatically optiaal data 
aapinllation nathod* The KB filter la optloal In that it 
ainultaneoualy niniinlzea all quadratic meaaurea of tha eatiiation 
error, i.e. , all quadratic functional^ of the difference between the 
eatiiaatei atate and the true atate* Like data aaaiiailation aatthoda in 
operational uae at NWP centera, the KB filter proceaaea data 
aequentiallyi obaervationa are diacarded once they have been procesaed, 
ao that past obaervationa are uaed only in the form of a forecaot 
laaued from them. The optimality and aequential nature of the KB 
filter has led to its successful use in a wide variety of engineering 
problems . 

The KB filter does not automatically provide slowly evolving state 
estimates, however, and therefore it is not directly applicable to NWP. 
To find a combined data assimilation-initialization algorithm, we solve 
a constrained minimization problem. Namely, we require that a 
quadratic functional of the estimation error be minimized, subject to 
the constraint that the ensuing state estimates will evolve slowly. 

The solution turns out to be a modified form of the standard KB 
filter. It is given by multiplying the usual KB gain matrix , which 
specifies the linear combination of forecast and observed values, by a 
matrix which projects onto the set of data which lead to slowly 
evolving forecasts. That is, the KB gain matrix Is multiplied by a 
projection matrix onto the model's slow-wave subspace . The projection 
matrix depends on the choice of error functional, and thus a tradeoff 
is involved in constraining the estimates to evolve slowly: the 
modified filter depends on the choice of error functional, whereas the 
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•tifidard KB fUt«r do«f not. When uilng th* •odlflod KB on* 
suit chooi* the error functional to be wlnlmlKed. 

The modified KB filter is equivalent to combining the atandard KB 
filter with the method of variational linear normal mode Initlallgation 
which is used at NWP centers. In this initialization method, the first 
guess estimate provided b/ a d^i^ta assimilation scheme is projected onto 
the slow-wave subspace, in such a way as to minimize a quadratic 
functional of the difference between the first guess and the 
initialized esttmaLe. Inasmuch as the first guess in our case is 
provided by the KB filter, the modified filter minimizes also a 
functional of the difference between the t rue s tat e and the initialized 
estimate. This property is important in deciding upon an appropriate 
error functional to be minimized. 

A simple model whose solutions exhibit many features of 
large-scale atmospheric flow. Including the two time-scale behavior, is 
the one governed by the shallow-water equations . Me Implement the 
standard and modified filters for a linear, one-dlmeoslonal version of 
the shallow-water equations. We use a simplified observing pattern 
based on the conventional meteorological observing network. 

One of the advantages of the estimation-theoretic framework is 
that it provides a way of assessing the performance of data 
assimilation schemes: the estimation error variances evolve in a known 
way. We make use of this fact to compare the performance of the 
standard and modified filters in our shallow-water model. Tlie results 
show that the modified filter does indeed produce slowly evolving 
estimates, and at the expense of estimation errors only slightly larger 
than those of the standard KB filter. 


Our nuiMricul cxpciitlwiintf ilso dtii0Mtr«te that the tvo filters 

t 

autoiutlcally determine observatlonai weights in accordance with local 
data density and with tlte amount of Information advected between 

dats-denso and deta*'Spafse regions • In particular | the filters are 
able to discern between data^sparse regions located upstream and 
downstream from a data”dense region* 

For our second applies ^lon of estimation theory* we Implement a 
version of optimal Interpolation (01) for our shallow-water model. 
Ihls data assimilation method Is In use at a number of NWP centers and 
Is under development at several others. Optimal Interpolation Is based 
on a number of assumptions concerning forecast error correlations and 
the evolution of forecast error variances. 

We use the estimation-theoretic framework to assess the 

performance of 01 In our model* and we compare the performance of 01 
with that of the standard and modified KB filters. Our numerical 
results suggest that the simplifying statistical assumptions Involved 
In 01 lead to relatively large errors near boundaries separating 
data-dense and data-sparse regions. We show that proper Initialisation 
Is a partial cure for this boundary effect , In that Inltlaliratlon 
helps keep estimation errors localized. 

A number of free parameters, such as forecast error variance 
growth rates over different regions, are specified In 01 schemes. We 

show that, by monitoring the*, size of estimation errors, these 

parameters may be adjusted, or tuned. The results show that, when the 
growth rates are properly tuned, the performance of an Initialized 
version of 01 Is roughly comparable to that of the modified KB filter. 


The tuning Is a my of allowing the 01 schene to Mke aone use of 
advected Infotiaatlon. 

We hope that our resulta, both theoretical and nua|erlcal» lead to 
a better understanding of the interaction between Initialisation and 
four-dimensional data assimilation. 

In Section 1.2, we review several aspects of NWP modeXs and 
meteorological observing systeiiis^ In order to acquaint the reader with 
some of the practical considerations Involved in numerical weather 
prediction. We also discuss the general foirmulatlon of data 
assimilation methods and we discuss normal mode Initialisation methods 
In some detail. 

Xn Chapter 2, we review the relevant aspects of estimation theory. 
We show, In particular, how estimation theory can be used to assess the 
performance of data assimilation schemes. A simple derivation of the 
KB filter is presented also. 

In Chapter 3, we Introduce the linearised shallow-water equations, 
and we formulate and discuss their slow-wave subspace. The discrete 
version of the shallow-water equations upon which our numerical 
experiments are based is given in Chapter 4, where we also define the 
slow-wave subspace of the discrete shallow-water model. The modified 
filter will depend upon the slow-wave subspace of the discrete model, 
rather than upon that of the original differential equations. 

Our main theoretical results appear in Chapter 5. After 
preliminary remarks in Section 5.1 and a review of projection matrices 
in Section 5.2, the modified KB filter is presented in Section 5.3; It 
Is given by Theorem 1. An efficient method for computing the projection 
matrix upon which the modified filter depends Is given by Theorem 2 of 


Stctlon 5>A. We describe e variety of choices for the error 
functional, and the correspondlns projection matrices, In Section 5*5. 

The description and Tresults of our numerical experiments with the 
standard and .modified KB filters are given in Chapter 6; the 01 
experiments are deBcrlbed In Chapter 7* A preliminary version of the 
experiments In Chapter 6 was reported In Ghll ^ ad* (1981). The 
results In Chapter 7 are summarized in Cohn e^al^* (1981). 

Theorems 1 and 2 of Chapter 5, and also the lemmas of Sections 5*2 
and 5*4, are proven In the Appendix* 

1»2* Background on Numerical Weather Prediction 

Most forecast models used In NWP are discretized versions of the 
so-called primitive equations, which are the Eulerlan hydrodynamlcal 
equations modified by the hydrostatic assumption. Finite difference 
and spectral methods are used most often for the dlscretlzatlohl finite 
element methods are used to a much lesser extent* The models are fully 
three-dimensional, depending on a vertical coordinate and on two 
horizontal coordinates. Global, hemispherical and limlted-arra models 
are all In use* See Haltlner and Williams (1980) for a full treatment 
of numerical modeling in NWP* 

The highest resolution global and hemispherical models have 
10^-10^ degrees of freedom — a large number even by present 
computational standards. Still, this resolution corresponds to a 
horizontal mesh spacing of 100-200 km, which Is not adequate for local 
prediction. Local forecasts are carried out by use of llmlted-aroa 
flne-mesh models and by subjective judgement* In any case, local 
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forccaits are based upon forecasts provided by global (or 
hemispherical) models. 

Errors are Incurred at each step of the prediction process. Some 
error Is made at the local levels The global models are another source 
of error, primarily due to discretization and improper modeling of 
physical processes. Finally, there is error in the determination of 
global initial values . 

Initial error is important, especially because the resulting 
forecast error grows rapidly. This growth is a consequence of the 
atmosphere's nonlinear dynamics, and is not an ‘artifact of the model. 
In stable linear systems, the effect of an initial perturbation tends 
to zero with time. The atmosphere is nonlinear and has locally 
unstable modes. Perturbations at small scales of motion are 
nonlinearly fed into the larger scales and eventually grow enough to 
completely contaminate a forecast. The atmosphere is in this sense 
unpredictable . 

Three approaches have been used to determine the rate at which 
predictability of large-scale atmospheric flow is lost. Lorenz (1969a) 
examined a five-year observational data set for atmospheric 
"analogues", or pairs of similar states, and studied their divergence 
in time. In the second approach (Charney et al . , 1966; Williamson and 
Kasahara, 1971), one studies Instead the divergence of pairs of 
numerical forecasts issuing from slightly different initial states. In 
the third approach (Leith and Kraichnan, 1972; Lorenz, 1969b), the 
transfer of error between different scales of motion, based on a 
presumed atmospheric energy spectrum, is calculated by means of 
statistical theories of turbulence. 


^ 10 - 

The three approaches give the sane quantitative results* For 
scales of motion resolved by current NWP models* two atmospheric states 
differing initially by h small amount diverge exponentially at first* 
with an rms error<’doubling time of about 2*-3 days* Errori level off in 
about 2-3 wef^s* after which time the two states become statistically 
uncorrelated * 

Errors in initial states are due to the incompleteness and 
Inaccuracy of data provided by the global observing network* Here we 
briefly describe the observing systems and the error structure of the 
data they provide. For a more complete discussion we refer to 
Bengtsson (1975) and Fleming et ^* (1979a*b)* 

The largest number of obsiirvatlons made at a single time each day 
are made at the so-called synoptic times, 0000 and 1200 hours Greenwich 
Mean Time (GMT); the synoptic times are chosen as initial forecast 
times. Synoptic data are provided by the conventional observing 
network of surface stations and radiosondes. In Figure 1* in the 
panels marked "surface"* "pilots" and "temps"* we show the distribution 
of conventional observations available at 1200 GMT on January 9* 1979* 
The uneven spatial distribution of conventional data is clear: 
observations are concentrated over the continents* especially those of 
the Northern Hemisphere* 

A number of additional observations, mostly surface observations* 
are provided by the conventional network at the subsynoptic times* 0600 
and 1800 QMT. K much larger number of observations* exceeding by now 
that given by the conventional network* are made In an essentially 
time-continuous manner by polar-orbiting satellites* geostatiomry 
satellites and other nonconvent ional observing systems (remaining 
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panels in Fig. 1)* Satf]Ulte observations graatly improve upon the 
spatial coverage of conventional observations » but their usefulness in 
providing initial data Is limited by the fact that satellite coverage 
of the globe is incomplete at any one time. 

Each observing system has its ovm particular error 
characteristics. The conventional network provides point values of 
pressure* temperature, horizontal velocity and humidity. These fields 
are highly variable, and hence point measurements are not 
representative of Volume averages* as they should be for numerical 
models. Although instrumental errors are relatively small* the total 
observational errors of conventional data can be quite large. 

Observational errors from nonconventlonal measurements are often 
even larger. Geostationary satellites ("satwind'* in Fig. 1)* for 
example* provide sequential cloud images from which horizontal wind 
velocities are deduced. Velocity errors in this case are large* 
primarily because of difficulty In determining the vertical location of 
the clouds being tracked. 

Observational errors are also correlated in a number of ways. 
Errors from radiosonde measurements are spatially correlated* as the 
sondes rise through the atmosphere. Pols ^•orbiting satellites 
("sateras" In Fig. 1) measure radiances at different wavelengths* from 
which vertical temperature profiles are deduced. The errors in a given 
profile are vertically correlated; profile errors are also horizontally 
correlated along the satellite track. Sequences of measurements from 
any single instrument are also likely to have temporally correlated 


errors. 


The Inaccuracy and Incompletenett of ohiervatlonal data available 
at any single time gives rise to the necissity of four -dimensional data 
assimilation. Bube (1978, 1981) has studied some theoretical aspects 
of data assimilation for general first-order linear hyperbolic systems. 
He gives conditions under which solutions of such equations are 
uniquely determined by nonstandard data, i.e. , by data other than 
complete initial data. for the method of direct insertion of 
nonstandard data during an integration, he shows how the rate of 
convergence toward the solution depends on the frequency with which 
data are available for Insertion. In a similar theoretical study, 
Talagrand (1977, 1981) has examined the convergence of direct insertion 
methods based on both forward and forward-backward integration, for 
linearized versions of the sliallow-water equations and of the primitive 
equations. Both studies assume perfect observations. 

Direct insertion, or replacement of forecast values with observed 
values, is not desirable in practice, because of observe tionr*! error 
and forecast error. Rather, an appropriate combination of observed and 
forecast values is sought. The usual procedure is as follows. 

At a given time when observations are available, differences are 
formed between the observed and forecast values, after an interpolation 
between grid points and observation locations. Thus, if w° is the 
vector of observations , if w^ is the vector of forecast values at all 
the grid points, and if H is a matrix which Interpolates from grid 
points to observation locations, then the observed-mlnus-f orecast 
residual is given by w°-Hw^. Once the residual vector has been 
determined, it is multiplied by a matrix K of weighting coefficients, 
and then added back to the forecast vector. The result. 


// 

Vv, 

( 1 . 1 ) 

1b knovm as the analysis vector, since It represents an "analysis" of 
the available observations. The forecast then proceeds, using the 
analysis vector as Initial data, and the entire process Is repeated 
when new observations become available. Thus, the schemes are 
sequential , In that observations are discarded once they have been 
processed. 

The central problem of four-dlmenslonal data assimilation is to 
determine an appropriate choice for the matrix K, as this matrix 
characterizes an assimilation scheme. This matrix Is known In 
estimation theory as a gain matrix . In global NUP, gain matrices are 
very large, on the order of 10^ x 10^, so that almost all elements must 
be zero In order to leave a tractable computational problem. In actual 
practice, data assimilation Is always carried out locally, so that gain 
matrices are block diagonal and have rather small bandwldths. 

Until very recently, most assimilation methods used relatively 
little statistical Information. The most popular such method, known as 
the successive correction method, was developed by Bergthorsson and 
DbUs(1955) and by Cressman (1959). In this method, weights assigned to 
observational data surrounding a grid point are functions of radial 
distance only; scans of the data over successively smaller radii from 
each grid point are employed, so that a smoothly varying analysis field 
results. A complete review of assimilation methods. Including the 
successive correction method, appears in Gustavsson (1981). 
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w* ■ w^ + X(w® - Hw^), 


We have already aeen that the error •tructur^i of ■atelllte 
observations is quite different than that of conventional observations* 
Thus, as nonconventlonal, satolllte-based observations becane 
available, It was realised that nonstatlstlcal procedures would no 

longer suffice > 

Statistical assimilation was first suggested by Ellassen (1954) 
and by Gandln (1963). The statistical schemes In current use are known 
as "optimal Interpolation" methods. In these methods, the gain matrix 
Is based upon a presumed forecast error covariance mc^trlx and Its 
presumed evolution In time. Current formulations of 01 are 

Biltlvarlate, In that forecast error correlations between different 
atmospheric variables are prescribed (Rutherford, 1973, 1976; 

Schlatter, 1975; Schlatter al. . 1976). Statistical assumptions 

based on the atmosphere's approximate dynamics are Involved In 
specifying the correlations. 

In Ctiapter 2 We will see tKat, like 01, the KB filter Is a 

statistical, sequential data assimilation method. The difference Is 
that the KB filter is based upon the correct evolution of the forecast 
error covariance matrix, which Is known by virtue of a 

stochastlc^ynamlc atmospheric model. 

We describe In Chapter 7 the statistical assumptions made In 01. 
Our Implementation of 01 will be based on that at the U. S. National 
Meteorological Center (NMC; Bergman, 1979; McPherson et al . , 1979) and 
at the European Centre for Medium Range Weather Forecasts (ECMWF; 
Lorenc, 1981). 

In actual practice, all data assimilation methods are 

Intermittent, rather than continuous. That Is, observational data are 
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grouped in Intnrvals centnred at th« synoptic tlnetf and aonatlnea at 
the tubsynoptic tines alsoi and data are asslnll.a.ted only at those 

tines. Asslnllatlon Is perfomed Internlttently for two reasons. 

Firsts Intemlttent assinllatlon Is nore compatible with dnta-handllng 
procedures such as croaa-chedcing for gross errors. Second* 
Intermittent assimilation allows Some time for the dispersion and 
dissipation of transient fast waves induced by each assimilation. 

Still* it is necessary to combine data assimilation with some form 
of initialization. The initialization methods which have dominated 
both research and practice In recent years, are the normal mode 
initialization methods. We describe them briefly here; see Daley 
(1981) for a complete review of these methods. For a review of other 
initialization methcds, see Bengtsson (1975). 

Normal mode initialization is based on writing the unforced 
forecast equations In phase space, as 

2 - - 1 Ajx ■+• El(j;*5) * (1.2a) 

z - - 1 A 2 Z + r 2 (y,z) ; (1.2b) 

jr(t) and z(c) are the slow and fast mode expansion coefficients* 
respectively; the Aj are constant, diagonal matrices, and the rj are 
nonlinear terms which depend on both slow and fast mode coefficients. 
The elgenfrequencles Aj^ are generally small compared to the 
elgenfrequencles A 2 , and the nonlinear terms are generally small 
compared to the linear terms. 

Linear normal mode Initialization was suggested by Dickinson and 
Williamson (1972); it was tested with a shallow-water equations model 


16- 


by WlXllaMon (1976)» and with a primltlva aquations model by 
Williamson and Dickinson (1976). In this method^ the "first guess" 
analysis vector provided by data assimilation Is adjusted by setting to 
sero Its fast components, while leaving Its slow components unchanged. 
1hus> If 2^(0) and z‘(0) are the modal coefficients of the analysis 
vector at time t *0, say, then the modal coefficients of the 
"corrected". Initialised vector from which the forecast proceeds are 
given by 


2®(0) - 2“(0) , zHO) - 0 . <1.3a,b) 

Were Eqs. (1.2) linear, the fast oscillations would thereby be 
eliminated for all time: z(t) S 0 If z(0) ■ 0 and £2 = 0. Ilie equations 
are not linear, however, and the nonlinear terms excite fast modes 
during the forecast. Indeed, at the Initial time we have 

whereby |(0) ^ 0, since 2 ®( 0 ) 0 generally. 

Nonlinear normal mode Initialization was Introduced by Machenhauer 
(1977), and Independently by Baer (1977) and Baer and Trlbbla (1977). 
This method attempts to eliminate the nonlinear excitation of fast 
modes by adjusting the Initial vector so that 

z(0) “ 0 . (1.5) 

That Is, the fast waves are required to be stationary at the Initial 
time. Ihe slow coefficients are not changed. 
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X®( 0 ) - X*(0) . (l.6«) 

To oatlsfy Eq. (1*5)» the feet coefflclante £*(0) eve dleeerdedi and 
replaced by 

zC(0) - - 1 A2^r2(x®(0),*«(0)) 5 (l.6b) 

a smallt balancing fast component la Introduced. In the Machenhauer 
Comulatlon, the nonlinear equation (1.6b) Is solved approximately > by 
one or two steps of functional Iteration. The Baer^Trlbbla method Is 
slightly different, and Is based on a careful nondlntenslonallzatlon 
with respect to the two time scales Involved. 

An extension of the nonlinear jn'cheroen, known as variational 
nonlinear normal mode initialization, has been developed by Daley 
( 1978 ). In this method, the fast coefficients are still required to 
satisfy Eq. ( 1 . 6 b), but the slow coefficients are now altered also, to 
reflect the relative accuracy of different atmospheric variables 
provided by the assimilation scheme. Thus, m discrete version of an 
error functional 

D - / [qu(u®-U^)^ + qv(v‘^-v®)2 +q^(^c_^aj2] (1,7) 

is minimized, subject to Eq. ( 1 . 6 b) as a constraint. Here u® and v® 
denote velocity components of the analysis vector, and ^® denotes the 
geopotential, while the minimization Is with respect to corrected 
values u*^, v®, ; the Integration Is carried out over the entire 

atmosidiere. The prescribed weighting factors q^ , q^ , q^ may vary In 
space and time; th^ reflect data density and accuracy. The weights 
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Aiir« taken to be large over data**denae reglonii for example^ ao r *;hat 
moat of the correction la over raglonl of epatae coverage where the 
analyaia vector la not likely to be very accurate. The variational 
approach can alao be applied^ of couree, to the linear Inltlallaatlon 
acheme. 

elow aanlfold concept of Leith U980) haa provided a fraiaework 
for understanding the behavior of normal mode Inltlallaatlon methoda. 
Ihe alow manifold approximates the set of slowly evolving solutions of 
the forecast equations; nonlinear normal mode Inltlallaatlon la viewed 
as projection onto the slow manifold. The nonvarlatlonal method 
corresponds to one type of projection. In the variational approach, 
the type of projection depends upon the weighting functions , q^ , 

The Rossby manifold Is the set of solutions of the forecast 
equations having a = 0; linear normal mode inltlallaatlon Is projection 
onto the Rossby manifold. For linear models, such as the shallow*^ater 
equations model we will work with, the Rossby manifold and slow 
manifold coincide. For linear models, the slow, Rossby manifold la In 
fact a subspace , l.e. , linear combinations of slow solutions are also 
slow solutions. 

We will see in Chapter 5 that the modified KB filter corresponds 
to combining the standard KB filter with variational linear normal mode 
initialization. That is, the standard KB filter can be viewed as the 
underlying assimilation scheme. With the KB analysis vectors being 
projected onto the slow-wave subspace. Essentially, we prove 
rigorously that this is the best that can be done for linear models. 
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For the leodified KB filter, one nuiit itlll ohooie the vecietional 
weights which define the type of projection* Ve will show thst, In 
addition to id.tiilmlslng the functionel (1*7), the isodlfled filter also 
mlnimlces the expected value of 

n' - / rq„<u<=-u')2 + ,^(vC-vt)2 + dA , ( 1 . 8 ) 

where v^, are components of the "true" atnospherlc stattf 

governed by a stochastic-dynamic model; n' le a functional of the 
actual error of the Initialized state* In our numerical experiments, 
we therefore minimize the expected value of the total enerxy of the 
error, l.e., we choose constant weights , q^ . the theory, 

however. Is developed for the most general quadratic error functional. 
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CHAPTER TWO 

ESTIMATION THEORY AWD METEOROLOGICAL DATA ASSIMIUTION 

In thla chapter* we review aome of the aapecta of eatlMtlon 
theory lAtlch apply to the problem of aaalmllating meteorological 
obaervatlofai . The theory la preaented only for the caae of 

dlacrete-tlme* linear ayatems. Jazwlnakl (1970) rlgoroualy treata both 
nonlinear and linear eatlmation theory* and Davla (1977) glvea an 
elegant account of the linear caae* An elementary treatment la 
available In Gelb (1974). 

2.1. The Stodtastlc-Dynareic Models 

In the application of estimation theory to data asalmllatlon* the 
atmosphere Is assumed to be governe;,^ by a atochastic'-dynamlc model 
which Is a randomly perturbed version of a given NWP model. The random 
perturbation Is meant to account for the discrepancy between the 
evolution of the actual atmosphere and the evolution described by the 
given forecast model.. The observation process Is represented by a 
second stochastic*^ynamic model: the observations are considered to be 
noisy "output" of the stochastic-dynamic atmospheric model. The 
assumptions on which the stochastic-dynamic models are based lead to a 
statistically optimal assimilation scheme, and to a method for 
assessing the performance of alternative schemes. The 
stochastic-dynamic models are described In this section* and their 
ramifications are discussed In subsequent sections. 
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We present only e slisple speclel cese of the theory: the given 
foreceet aodel is linear • and hence does not represent an actual NWP 
model* The forecast model Is expressed symbolically as 

2k - '^k-l 2k-l » <2-D 

for values of the discrete tine k 1»2*3*.** . The vector Wj^ has 
dimension n and the dynamics matrix Is nxn, where n Is the number 
of degrees of freedom of the model: n Is as large as 10^ for actual NWP 
models* Interpreting Eq. (2*1) as a finite-difference model, the 
components of Wj^ approximate at time k the values of the atmospheric 
variables at each grid point; consists of finite-difference 
coefficients and advances the forecast from time k-1 to time k* Forcing 
terms are omitted from Eq. (2.1) for simplicity. 

The model (2.1) Is linear: H* does not depend on w* The linearity 
assumption leads to substantial simplification of the theory reviewed 
In this chapter and extended in Chapter 5. Linearity does not obscure 
the main phenomenon of Interest: linearized NWP models have solutions 
which ii^^olve on two time scales. 

Given the forecast model (2.1), It Is assumed that the true 
atmospheric state evolves according to the stochastic-dynamic model 

2k “ '^k-l2k-l + fek-1 • <2-2a) 

The superscript denotes that the state vector w^ I'; an n-vector of 
the true , but unknown, values of the atmospheric variables. The random 
perturbation, or system noise , is the n-vector ]^_i* In general, It Is 
supposed to account for dynamical and physical processes Improperly 
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■odeled by the forecast equations (2.1), as well as for truncation 
errors due to discretization* Since the dynamics of Eq* (2^ 2a) are 
linear^ we will want the noise to account In particular for thu 
nonlinear effect of unpredictability. 

The sequence {]^: k - 0|^l,2t***) Is assumed to be a white noise 

sequence with mean zero and known covariance matrix 

®fek " " \^kt* (2.2b,c) 

The operatp'? E indicates the expected value, or ensemble average, the 
superscript T denotes the vector or matrix transpose, and the symbol 
is the Kronecker delta, ■ 0 If k 1 and ■ 1 If k ■ 1. The 
mean-zero assunption (2,2b) Is made for cenventenee^ and Is not 
essential to the theory. 

Having described the state model , Eqs, (2,2), we now describe the 
observation model . Suppose that a vector of observations w^ Is 
available at a given time k. It Is assumed that the observations can be 
modeled by the equation 

2k-HkHk+U^ <2 -3a) 

The dimension p of the observation vector Is the number of 

measurements available at time k; p ■ p(k). The observation matrix 
Is a nonrandom pxn matrix, and the observation noise b^ is a random 
p-vector. Typically p << n. 

The elements of the observation vector are the raw observational 
data themselves. We merely assume that they are related to the true 
atmospheric state, and hence to Eq, (2,2a), by Eq, (2,3a), The 
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obiervation model is assumed to be linear: doos not depend on w^* 

In other words* Eq. (2«3m) models noisy observations of linear 
combinations of elements of the state vector* 

The linear combinations correspond to the fact that the 
observation matrix &ust Interpolate from variables defined at grid 
points (the elements of w|^) to variables defined at observation 
locations (the elements of w^). The two sets of variables need not 
represent the same meteorological fields. For example* part of the 
observation matrix could contain linear regression coefficients for 
converting between temperature components of w^ and satellite-measured 
radiances In w^. 

The random vector models observational error, which Includes 
both instrumental error and the sampling error Inherent in point 
measurements of fields with considerable spatial variability. The 
observational noise Is assumed to be white* with mean zero and known 
covariance matrix Rj^; 

- 0 * E(bg)(bg)T - , (2.3b,c) 

and Is assumed to be uncorrelated with the state noise: 

Assumptions similar to (2.3b,c*d) are also Implicit In the data 
assimilation schemes In use at NMC (Bergman* 1979) and at ECMVfF 
(Lorenc* 1981). 
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The fltochastic-dynamlc modeli (2*2» 2.3) expctii the basic 

assumptions Involved In our application of estlnatlon theory to NWP* 
Next, ve Introduce the class of data assimilation methods to be 
cofisidered, and then we discuss the Implications of the models for the 
data assimilation problem. 

2.2. Unbiased Linear Data Assimilation Methods 

Suppose that at some time k**!, observational data have been 

used to provide an estimate of the true atmospheric state w^_i. 

The superscript ^ Is used for the estimate because the estimate Is 
known in meteorological practice as the analysis vector ; It represents 
an "analysis" of the observations available at time k~l. The analysis 
vector Is assumed to be unbiased , l.e. , 

®^2k-l ■ 2k-l) ■ 0* 

Like the state vector, the analysis vector has dimension n. 

The purpose of a data assimilation scheme Is to combine the 
estimate with new observations w^ which become available at time 

k, to produce a new estimate We consider only unbiased , linear 

assimilation methods ; a new unbiased estimate la sought as a linear 
combination of the old estimate and the new observations, l.e., 

E(^ - wj) - 0 , (2.5) 


2k “ K-l2k-l %2k 


( 2 . 6 ) 
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The Matrices and are nonrandom and have dimensions nxn and nxp» 

respectively. 

The assumptions Inherent In the stochastic-dynamic models sjctually 
determine Stid allow Eq. (2,6) to be written In a more 

Intuitively appealing way. Substituting Eq. (2.6) Into Eq. (2.5), 
and using Eq 8. (2.2ayb)y (2.3a,b) and (2.4)| one finds that 

0 ■ - Sk) ■ IW-1 - <1 - KkHkMk-llSxE-l- 

since t* 0 generally, It must be that 

Vl - (I - KkHk)4'k.i . 


80 that Eq. (2.6) becomes 

Hk - H'k-iwg-l + Kk(wg - HkH-k-lHg-i). 

Now ’•'k-lHk-1 Just a one-step prediction from time k-1 to time 
k, cf. Eq. (2.1), Defining the forecast vector w^. 


SJk " '**k-lHk-l » 


(2.7a) 


we therefore have 


Sk ■ sj + >^<s£ - «kSk) i 


(2.7b) 


cf. Eq. (l.l). Ihe forecast and analysis vectors, given by Eqs. 
(2.7a,b), are both estimates of the atmospheric state. The forecast Is 
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• fir»t gue»« ; the analysis Is presumably a bettcit estimate since It 
Incorporates the new observational Information. 

Eqs. (2.7) represent the general form of all linear » unbiased 
data assimilation schemes. With Eq.(2.7a) replaced by a fully 
nonlinear NWP model, all statistical assimilation schemes used at NWP 
centers can be written In the form (2.7); see, for example, Gustavsson 
(1981). 

The general form of the scheme (2.7) states that the analysis 
vector Is the sum of the forecast vector and a linear combination of 
the elements of the observed-mlnus-forecast residual wg - The 

gain matrix Kj^ specifies the linear combination, and therefore 
characterizes the assimilation method. We take the dynamics matrix 
and the observing pattern to be given, and focus attention on how to 
specify the gain matrix. 

If there are no observations available at some time k, then the 
second term on the right-hand side of Eq. (2.6) Is not present; 
equivalently, ■ 0* In this case, Eqs. (2.7) become 

Hk “ '‘'k-lHk-1 * (2.8a) 

2k " Hk *• 

l.e. , the forecast simply proceeds when there are no data to be 
assimilated. 

In particular, Eqs. (2.8) hold for all k > N, where N Is the most 
recent time at which observations are available. The assimilation 
scheme is provided an unbiased initial estimate w^ , data are 
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•ssimllated at times k ■ and then a forecast Is Issued, using 

as Initial data# We are concerned primarily with the case k < 

2»3. Assessing the Performance of Data Assimilation Methods 

One of the main advantages of assuming the stochastic-dynamic 
models (2.2) and (2.3) Is that they lead to a method for assessing the . 

performance of data assimilation methods of the form (2.7). That Is, 
for any choice of gain matrix sequence {K^^: k - 1,2,3,...}, It can be 
determined bow well the corresponding estimates {w|^, w^; k ■ 1,2,3,...} 
represent the true states (wj^t k - 1,2,3,...}. 

To show hew this Is so, we Introduce the estimation error 

I 

covariance matrices . These are the forecast error covariance matrices , | 

defined by | 

“ E(w{^ - w^)(wj[ - w^)'*^, (2.9a) 

I 

and the analysis error cova rlance matrices , defined by 

^k " E(wg - 2k><Sk ■ 2k)^* <2. 9b) 

I 

The forecast and analysis error variances , which are the primary | 

measures of an assimilation scheme's performance , are located along the 
main diagonals of P|^ and P^, respectively. 

From Eqs. (2.2) and (2.7a), It follows that P^_j Is advanced by 
one time step to P^ according to 

I i 

i 

ffc - + Qk-l • I 

1 I 

f , , ■ 

f ! 

I i 

L . i 
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whlle Eqs. (2.3) and (2.7b) inply that P|[ la found from by tha 
fornula 


Pg - (I - K^^HJ^)P{(1 - KfeH^)'^ + (2.10b) 

The estimation ertor variances and covariances can therefore actually 
be computed, provided that Fg is known. Eqs. (2.10) will be used 
repeatedly In our numerical experiments, to compare the performance of 
data assimilation schemes based on a variety of choices of the gain 
matrix sequence. 

The terms In Eqs. (2.10) have a simple Intuitive Interpretation. 
The first terra on the right-hand side of Eq. (2.10a) determines how 
estimation errors are advected between data-dense and data-sparse 
regions from one time step to the next. The numerical experiments 
reported in Chapters 6 and 7 indicate that the effect of advectlon of 
Information Is Important In data assimilation. The second term In Eq» 
(2.10a) Is due to the presence of system noise, and results In a 
tendency of /Inear growth of estimation error variance. 

The first term on the right-hand side of Eq. (2.10b) determines 
the extent to which new observational Information Improves the 
forecast, and the second term Indicates the deterioration due to 
observational error. For a data asilmllatlon scheme to perform 
properly. Its gain matrices must be such that the error reduction 
given by the first term dominates the error growth given by the second 


term. 
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The sltBple form of Eqs* (2. 10) It due prlntrlly to the attumed 
linearity of the itochaatlc-dynamlc models* For nonlinear models, 
equations like (2*10) Still hold but are more complicated (Jaswlnskl, 
1970, Sec. 6. A); the right-hand sides depend on higher-order moments 
of the estimation errors. 

Still, the computational task Involved In advancing Eqs* (2*10) 
exactly Is laborious; matrix-matrix operations mist be performed at 
every time step, regardless of how often observations are assimilated* 
For this reason, our numerical experiments will be performed with a 
simple model Involving only n - A8 state variables* 

2* A* Derivation of the Kalroan-Bucy Filter 

Eqs. (2.10) give the estimation error variances corresponding to 
any choice of the sequence of gain matrices* Hence, a sequence which 
minimizes the variances can be determined* The assimilation method 
based on the ralnimlzlng sequence Is called the Kalman or Kalman-Bucy 
(KB) filter , after Kalman (1960) and Kalman and Bucy (1961), who first 
formulated It for processes governed by linear systems of ordinary 
differential equations* 

Before deriving the KB filter, we review some facts from linear 
algebra. All vectors and matrices in the following discussion are 
real. 

A square matrix A is symme trie if A*^ ■ A. An tm<m matrix A is 
positive definite If it is symmetric and If x"^Ax > 0 for all nonzero 
m-vectors x. If A Is positive definite, then A Is nonsingular , l.e* , 
A~^ exists . Every positive definite matrix A can be factored 
(nonuniquely) as A ” AjAj, where Aj Is nonsingular. 
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An nKm maitrlx A Is posltivs stmidefinlts it it Is sysssstric and it 

x^Ax ^ 0 for all m'^ectors x. Every positive semldefinite guitrix A can 

be factored (nonunlquely) as A • AJAj.) where Aj may be singular. 

^ covariance matrix A is a matrix of the form A ■ where y is 

a random m-vector with ■ 0. All covariance matrices are symmetric 

and positive semldeflnlte. If every nontrivial linear combination of 

m 

the elements of ^ has positive variance » l.e» , if e( x^yj^)^ > 0 for 

_ 1*1 

all nonzero ro-yectors x, then A ■ Eyy* is actually positive definite. 

The trace of a square matrix A Is the sum of Its diagonal 
elements. The trace operator has the properties’ 


trace * trace A, (2.11a) 
trace (A+B) • trace A + trace B ^ (2.11b) 
trace AB ■ trace BA , (2.11c) 
trace xy^ “2^2 » (2. lid) 
trace BB^ > 0, (2. lie) 
trace ■ 0 if and only if B ■ 0, (2. Ilf) 


for all roxra matrices A,B, and for all nrvectors x. y. 

We now give an elementary derivation of the KB filter, based on 
minimizing a quadratic functional of the analysis error. Let A be an 
arbitrary nonrandom positive semldeflnlte nxn matrix, and let 

“ EHgt- wj)’' A(w^ - w^)]. (2.12) 

The functional Is a general measure of the analysis error. The 
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functlonil could represent the total expected kinetic energy of the 
enalysls error , for example^ In which case A would be a diagonal satrlx 
with appropriate weights for the nass and wind fields along Its 
diagonal* Weighting matrices of Interest are usually positive 
definite, but we allow A to be positive aemldeflnlte to Include the 
possibility that one might not be concerned with the error In one or 
more meteorological fields at one or more grid points. 

Eq. (2.12) will now be rewritten to make the dependence of n^^ 
upon K|^ explicit, and then will be minimised with respect to K|(. 
First, notice that 

n * E{traee[A(^ ^ (2.13a) 

■ trace AP® ; (2.13b) 

the first equality follows from property (2*lld), while the second 
follows fron the fact that the expectation and trace operators commute, 
from the nonrandomness of A, and from the definition of P®. The 
subscript k has been dropped In Eqs. (2.13), and will be omitted 
throughout the derivation. It Is Implicit that k Is a time at which 
observations are available, for otherwise Kj^ ■ 0; cf. Eqs. (2.8). 

To find a suitable expression for P® to be Inserted Into Eq. 
(2.13b), we expand the products In Eq. (2.10b), to get 

P® - KCK'*' - KHP^ - P%TrT + pf ^ (2.14a) 


where 


(2.Ub) 
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C - HP^rT + R . 

The matriix C !• •yminetrlc, tlnce and R ara covariance Mtrices* 

We aaaune that all nontrivial linear conblnatlona of the elements 
of the obaervatlonal nolae b^ have nonzero variance » l*e« , no linear 
combination of the meaaurementa is "exact". The obaervational error 
covariance matrix R la therefore poBltiire definite; it follows that C 
la also positive definite, and hence nonsingular. Consequently, one 
can conq>lete the square in K in F.q. (2.l4a), to get 

P* - (K - P^HTc”i)C<K - P^hTc“^)T + Z, (2.15a) 


where 

Z ■ pf - pfHTc-lRpl. (2.15b) 

Notice that Z is Independent of K. 

Finally, substituting Eq. (2.15a) into Eq. (2.13b), we have 

n - trace [A(K - P^hTc"1)C(K - P^h'*^C‘"1)T + AZ] . (2.16) 

Factoring A and C as A - a|Ai , C - CiCj, and using properties 
(2.11b,c), Eq. (2.16) becomes 

n ■ trace + trace AZ , (2.17a) 


where 
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B - Aj(K - p'hTc-1)Cj. (2.l7b> 

Since AZ does not depend on K» end due to propeirttes <2.1ie|f)f It 
followe fron the representetlon (2»17) thet n I* ninlMted with respect 
to K If end only If B ■ 0» l*e*» Iff 

Ai(K - P^hTc-1)Cj » 0. (2.18) 

Therefore, a gain matrix which minimizes n la given by 

K - P%Tc-l. (2.19) 

Moreover, the mltilmtztng gain matrix is unique If the weighting 
matrix A, which was assumed to be only positive semldeflnite, is 
actually positive definite. If A - A]'Aj la positive definite, then Aj 
is nonsingular; C C|C][ is already known to be positive definite, so 
Cj Is nonsingular. Premultiplying Eq. (2.18) by Aj^ and 
post mult ip lying by yields Eq. (2.19); l.e., the minimizing gain 

matrix is unique. 

The gain matrix given by Eq. (2.19) is the Kalman or Kalman-Bucy 
(KB) gain matrix , which we denote by K^. Using Eq. (2.14b) in Eq. 
(2.19), the KB gain matrix is given for each observation time k by 

k|^B . pfHj(\pJfHj + Rjt)“^» (2.20a) 

We have simply 


KjJB - 0 


(2.20b) 
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If thfete ttt no obitrvAtlont «t: fcliM k* 

Requiring rij^ to be nlnlmlzed euccesslvely) et each tine k » 
reeults In the KB gain natrlx aequenoe k " 1,203,..*}; 

thM Kalwan or Kalaan-Bucy filter la the cotreapondlng data aaeladlatlon 
nathod. Since dependa upon the eatlnatlon error covariance 

matrlcea C2«i0) muat be computed during the aanlmllatlon. When K|^ m 
however, It followa readily from Eqa.(2»15, 2*19) that the general 
formula (2«10b) alnpllflea to 

Pg - (I - kPh^)pJ, (2,21) 

Eqst (2*7, 2*l0a, 2*20, 2.21) constitute the Kalman filtering 
algorithm for discrete-time, linear ayatems. We recapitulate the 
equations here for convenience: 


2k - '‘'k-iH^-i * 

(2.22a) 

H - tic-iPt-iVj-i + Qk-1 , 

(2.22b) 

K]J» - PM(»kPM + , 

(2.22c) 

pg - (I - , 

(2.22d) 

2k " 2k “ ^k2k) * 

(2.22e) 


for k - 1,2,3,..., given ^ and Pq. In the absence of observations at 
time k, Eq. (2.22c) Is replaced by 0, and Eqs. (2.22d,e) 
simplify accordingly. 

Notice that Eqs. (2.22b, c,d) do not depend on the estimates 
provided by Eqs. (2.22a,e). This Is a consequence of the linearity of 
the stochastic-dynamic models. Given Pg and the dynamics the gain 


aatrix lequence depends only on the observing patterns and noise 
covariance aa trices. If these are known In advance, then the gain 
aatrlx sequence can be computed before the asslallatlon begins* 

2.5. Optimality Properties of the Kalman-Bucy Filter 

We now discuss some of the optimality properties of the KB filter. 
Notice, first of all, that K*® Is actually Independent of the weighting 
nx^trlx A which defines q: A does not appear, explicitly or Implicitly, 
InEq. (2.20a). In other words, simultaneously minimizes all 

positive semldeflnlte quadratic functionals of the analysis error, and 
uniquely minimizes all positive definite quadratic functionals of the 
analysis error. 

Suppose In particular that A ■ ®jej, where ej is the column of 
the nxn Identity matrix. Then, according to definition (2.12), n Is 
jtist the variance of the component of the analysis error. 

Consequently, since j is arbitrary, K*® minimlzcss the analysis error 
variance of each meteorological variable at each grid point* The KB 
filter Is a ml nlmum- va rlance estimator . 

Suppose we fix a time k ■ A. It was sho?m In Sec. 2.4 tViat 

requiring to be minimized, in turn, for each time k » 1,2,. ..,A, 

results In the KB gain matrix sequence {K^®: k « 1,2, ...,&}. Suppose 
that instead we wish to minimize only without regard for the values 
of Tifc at the previous times k - 1,2,...,A-1. In other words, is to 
be minimized with respect to the entire sequence K^ .k^ . . . . .K^^ . The 
result is still - K^® , k - 1,2,..., A. That is, the KB filter Is 
time-optimal ; the minimum value of for each fixed A, Is attained by 
using the KB gain matrix sequence. 
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A slightly more genersl ststemsnt of this fact Is the following* 
Asslnllstlon schemes of the form (2*7) are sequential , or recursive ! 
the only observations upon which the current analysis vector w^ 
explicitly depends are the current observations w^, so thqt 
observations may be discarded as soon as they are processed. Suppose 
that we consider a more general class of assimilation schemes, In which 
each analysis vector Is explicitly a linear combination of all 
available data: 


k 

2k “ ^k ,0 Ho ^k,j Hj * 


k ^ 1,2,3,... 


(2.23) 


It can be shown (e.g., Jazwlnski, 1970, Sec. 7.3), that In fact the KB 
filter Is optimal among assimilation schemes of this more general 
class: with given by Eq. (2.23), minimizing either for all k or 
for a fixed k still results in the KB filtering algorithm. 

This wider optimality is due to assumptions (2. 2c, 2. 3c, d) that the 
system noise and observational noise are uncorrelated In time. If the 
system noise and observational noise are also assumed to be Gaussian, 
it is known (e.g., Jazwlnski, 1970, Sec. 5.2) that the optimal 
nonlinear assimilation scheme, i.e. , one In which w^ might depend 
nonlinearly on all past and present observational data. Is still the KB 
filter. 

2.6. Further Remarks on Estimation Theory 

By applying the theory outlined in Secs. 2. 1-2.4 to a simple 
model ¥ based on the linearized shallow-water equations, we will see in 
Chapters 6 and 7 that much can be learned about the properties of data 
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•••inllatloii schemes In a meteorologically familiar* but somewhat 
Idealized setting. The method of Sec. 2.3 provides a way to determine 
how well any scheme of the form (2.7) performs* and for comparison we 
have an "optimal" scheme* namely the KB filter. For completeness* we 
now discuss some eKtenslons of the rudimentary theory we have outlined* 
Which might lead to practical assimilation schemes for global HWF. 

First* notice that the sequential nature of the Kalman-Bucy filter 
Is made possible by the fact that the forecast error covariance matrix 
known at each time k. All observational Information avallaible 
prior to time k Is embodied in the forecast veocor w|^ and the 

covariance matrix * so that the only additional Information needed 
at time k Is the current observational information (w^»Hj^,Rj^)* cf. 
Eqs. (2.22c*e). That P|^ Is known Is a consequence of the fact that 
the system noise covariance matrix Qj^ was assumed to be known, cf. 
Eqs. (2.2c* 2.22b). A priori knowledge of Qj^ is, however* not 

essential: Qj^ can be determined adaptively* l.e. * during the 

assimilation process Itself (Belanger* 1974; Ohap and Stubberud* 1976; 

Clitn* 1979; Maine and Illff, 1981). The observational noise covariance 
matrix Rj^, as well as the means of the observational and system noise, 
can also be determined adaptively* 

For a realistic NWP model, the corresponding stochastic-dynamic 
model (2.2a) would be nonlinear. Estimation theory still leads to an 
optimal assimilation method In this case (e.g. * Jazwlnskl* 1970* Ch. 
6). However* as the right-hand sides of the equations corresponding to 
(2.10) Would depend on all the higher-order moments of the estimation 
error* approximations would be required to make the algorithm finite* 


-38- 


Many approxlinate nonlinear filters have been fornulated 
(Jazwlnskl, 1970, Ch. 9, and references therein). A particularly 
simple nonlinear filter, often used In engineering applications. Is the 
extended Kalman filter (EKF; Jazwlnskl, 1970, Sec. 8.3j Gelb, 1974, 
Ch. 6). The EKF Is essentially the usual linear KB filter, with the 
nonlinear dynamics being linearized about each successive state 
estimate. Discussion of the applicability of the EKF to numerical 
weather prediction appears In Ghll et al. (1981, Sec. 5). 

Even for linear stochastic-dynamic models, the KB filter presents 
a formidable computational task. The matrix-matrix operations In Eq. 
(2.2^b) nust be performed et every time step; Eqs. (2.22c,d) are 
needed only when observations are available. Assinilatlofi Schemes In 
current use require only matrix-vector operations (2.8) In between 
observation times. There are, however, a variety of ways to 
reformulate Eqs. (2,22) for computational efficiency (Blerman, 1977; 
Paige and Saunders, 1977). In particular, the matrix Inversion in Eq. 
(2.22c) can be avoided. 

Still, for global NWP, It would probably be necessary to calculate 
Eq. (2.22b) In an efficient approximate form. Many approximate forms 
are possible. In fact, the '^optimal Interpolation" (01) methods In 
operational use at NWP centers can be regarded as being bao<^d on such 
an approximation. In Chapter 7, we describe the approximate version of 
Eq. (2.22b) upon which 01 Is based, and we Implement 01 for our 
dynamical model f . The method of Sec. 2.3 Is used to determine the 
effect of the approximations involved In 01. 
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In sunmry, the framework of estimation theory Is well-suited to 
study the prohleros of meteorological data assimilation, ajid we have 
Indicated how the theory outlined In Secs. 2. 1-2. A might lead to 
practical assimilation algorlth*^ for global NWP. 

Our purpose here Is to develop the theory In a different 
direction, l.e. , to account for the Initialization aspect of data 
assimilation. We show In Chapter 5 that for dynamical models having 
two time scales , requiring the state estimates to evolve slowly leads 
to a modified version of the KB filter. First, In Chapter 3^ we 
describe the linearized shallow-^ater equations and their slow 
solutions, and then In Chapter 4 we describe the slow solutions of a 
corresponding discrete model y. 
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CHAPTER THREE 


(I 


SLOW SOLUTIOWS OF THE LINEAR SHALLOW-WATER EQUATIONS 

The shallow-water equations govern the motion of a thin layer of 
incompressible, invlscid fluid over a given surface* The shallow-water 
aquations over a rotating sphere give a simplified description of the 
dynamics of the earth's atmosphere, and solutions of the equations 
exhibit many Important properties of large-scale atmospheric flow 
(e.g., Pedlosky, 1979, Ch* 3). In particular, the equations possess 
both slowly evolving solutions and rapidly evolving solutions. 

In this chapter we describe the slow-wave subspace of a linear , 
spatially one-dlmenslonal version of the shallow-water equations* The 
slow-wave subspace is the set of all initial data which lead only to 
slowly evolving solutions of the linearized equations. 

3*1. The Equations 

The linear, spatially one-dimensional shallow-water equations, 
written in cartesian coordinates for a plane tangent to the earth at 
latitude 0Q, are given by 


+ Uux + 4x ■ 

fv ■ 0 , 

(3.1a) 

+ Uvx + 

fu ■ 0 , 

a*lb) 

+ U 4 .X + *Ux - 

fUv • 0 . 

(3*lc) 


The coordinate x pol^'ts eastward, in the zonal direction, along the 
circle of latitude 6 •• 6q, while y points northward, in the meridional 
direction; u and v are velocity components in the x and y directions. 
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respectively* The geppotentlel ^ • gh measures the deviation of the 

* 

height H+h of the free surface from Its equilibrium value H; • ■ gH Is 
constant and g Is the acceleration due to gravity. The constant U Is 
the mean zonal velocity and f - 2(2 sin 6 q i# the Coriolis parameter* 

with (2 the angular velocity of the earth. The subscripts x and t 

denote differentiation with respect to x and the time t; all quantities 
are Independent of y. 

These equations are derived from the full* nonlinear shallow^ater 
equations on a tangent plane* 

u^ + uujj + vuy + 4>x ■ 0 * (3* 2a) 

^t '^'^y + 4>y + fu “ 0 ♦ (3.2b) 

♦ t + ^4>y + ♦('*x‘^ ^y) " 0 » (3.2c) 

by linearization around the solution u“U* v»0*4 ■ *(y) satisfying 
fU ■ - *y M const. The quantities (u*v,(|i) in Eqs. (3.1) are 
perturbation quantities* or departures from the equilibrium values 
(U*0**), while in Eqs. (3.2) they denote the total amplitudes. The 
derivation of Eqs. (3.1) from Eqs. (3.2) Is based on the assumption 
that the perturbation quantities do not depend on y. 

The parameters f * U and ♦ In Eqs. (3.1) will be chosen to 
correspond to midlatitude flow* ~ A5°N. An Important feature of 
large-scale midlatitude atmospheric dynamics Is the approximate balance 
which exists between the pressure-gradient force and tlie Coriolis force 
(e.g., Holton* 1972* Sec. 2.4). Atmospheric sta<r^;is In which these two 
forces are exactly balanced ere called geos trophic . In the nonlinear 
system (3.2)* the pressure-gradient terms are and ^y 


f 


and the 
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Corlolis terns are -£v and £u ; geostrophlc atatea o£ thla ayatem are 
thoae £or which u - - +y/f and v - ♦x/^» Notice that the aolutlon 
about which E<i8* (3 *2) were linearized ia geoa trophic^ 

Since X is the coordinate along a circle of latitude^ the 
appropriate boundary conditions for Eqs* (3*1) are periodic: u(x+2L»t) 
■u(x,t), and slnilarly for v and where 2L ia the earth's 
circumference at latitude For reasons described in Sec. 6.1, we 

seek only two-psriodlc solutions of the equations. We therefore inpoae 
the boundary condition 

yj("^»t) "wC-j-,!), (3.3) 

where w(x,t) - lu(x,t),v(x,t),«|i(x,t)]'*^, and we solve the system 

(3. 1,3.3) in the spatial domain -L/2 ^ x ^ L/2. 

Corresponding to latitude Qq- 45°N, we tdce f ■ 10"^ sec“^ and L ■ 
14000 km for our system (3. 1,3. 3). The mean zonal current is taken to 
be U *• 20 ms“^, which is typical f<.r mld-tropospherlc flow at this 
latitude, while 4 ■ 3xloV^s“^, which corresponds to an equivalent 

depth of H = 3 km for a homogeneous atmosphere. 

The initial-value problem for the hyperbolic system (3.1), with 
boundary condition (3.3), Is well-posed (e.g., Courant and Hilbert, 
1962, Sec. 5.6). That Is, for arbitrary Initial data w(x,0) which 
satisfy the boundary condition and have continuous first derivatives, 
there exists a unique solution w(x,t) of Eqs. (3. 1,3.3) and the 
solution depends continuously on the initial data. If the initial data 
are j times differentiable, then all order partial derivatives of 


the solution exi st 
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Slnce th<i equAtlpns ace linear and have conitant coefflcienta, Che 
solution corresponding to any legitimate choice of initial data can be 
expressed as a Fourier series. We shall do S 0 | and then use the result 
to determine the slow-wave subSpace. Slowly evolving atmospheric 
states are approximately geostrophlci or quasigeos trophic (e.g., Leith, 
1980). We will see that slowly evolving solutions of our linear system 
are also quasi geos trophic. 


3.2. Fourier Series 

First, we review some of the properties of Fourier series; f'^r 
further reference see, e.g. , Churchill (1969). Let |(x) be a vector or 

T r 

scalar function defined for x e [- . Which is coitlnuous there, and 

which satisfies g(-L/2) ■ g(L/2). The Fourier coefficients g(5 ) of 
g(x) exist and are defined by 

1 

g(5) ■ Y" / e”^^* g(x) dx, (3.4a) 

-L/2 

for 5 twanging over the discrete values 


5 


5(w) 


- 2r 

T 


to I 


(0 


0 ,± 1 ,± 2 ,...; 


(3.4b) 


5 Is the spatial frequency and w is the wave number. 
understood that the variable C takes on only the 
defined In Eq. (3.4b). 

If g(x) Is real. It follows Immediately from Eq. 


Henceforth It Is 
discrete values 

(3.4a) that 


g (-0 “ 6(5 ), 


(3,5a) 
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for all C, where the overbar denotes complex conjugation* In 
partlculari 


g(0) is real ; 


(3.5b) 


g(0) Is just the average value of g(x). 

If g(x) Is also differentiable on [-L/2,L/2], then the Fourier 
series representation of g converges polntwlse to g on [-L/2,L/2). 
lhat Is , 

g(x) "I e^^* ’4<0. (3.6) 

S 

for all X e [-L/2,L/2], where g(5 ) is given by Eq. (3.4a). Ihe symbol 
indicates summation over the discrete values of ^ given by Eq. 
(3.4b). 

L *L 

If, In addition, the second derivative exists on and 

if ■ gx(i^/2), then the series obtained by termwise 

differentiation of the series in Eq. (3.6) converges polntwlse on 
[-L/2,L/2] to That Is, 

gx(x) - I e^^* ix(C^» (3.7a) 

K 

T T 

for all X e l”y»yl , where 


gx(5) - Ug(C). 


(3.7b) 
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3»3« Solution of the Initial-Value Problem 

Let L be the eet of real S^ectore ^(x) which are 

twice-dlfferentlable on [-L/2,L/2] and which aatlefy |(-iy) ■ (((y) end 
^ consider for Eqs. (3*1) only those Initial data 

which lie In i* 

Suppose Initial data w(x^O) e L are given. According to the 
well-posedness discussion In Sec. 3.1| a corresponding unique solution 
w(x,t) exists^ and w(x,t) e L for each t > 0. Eqs. (3.4-3. 7) therefore 
apply to w(x,t), from which It follows that 

w(x,t) ■ e^^*w(C»t)» (3.8) 

5 

for all xe where, corresponding to Eqs. (3.1) » the Fourier 

coefficients w(^,t) satisfy the ordinary differential equation 


^w(5,t) - 1G(?) w(5,t), 


(3.9a) 


for all ? , and G(C) Is given by 


G(e) - - 


eu 

If 

e 



HLf 

eu 

0 

• 

(3.9b) 

I — ~ 

IfU 

eu 



solution 

H(x* t) 

in 

terms of the Fourier 


To express tb 

coefficients w(^ ,0) of the initial data, It remains to solve Eqs. (3.9) 
and substitute the result Into Eq. (3.8). The solution of Eqs. (3.9) 
is simply 


w(C ,t) - w(c .0) ; 


(3.10) 


•«e* for example, Coddington and Levinson (1955, Sec. 3.4). 

Equation (3.10) can be expressed more conveniently by expanding 
w(C ,0) In terms of the eigenvectors of G(^). I/tt X|^(C), i ■ -1,0,1, be 
the eigenvalues of G(^ ), with corresponding eigenvectors 3j((C ): 

m Hi <5 ) ‘ h > U > • ( 3 . 1 U) 

for t - 0,±l, and for all 5. The matrix has the same 

eigenveetora, but with eigenvalues 

elG(5)t (3.Ub) 

for i - 0,±l, for all 5, and for all t _> 0. 

It will be shown In Sec. 3.5 that the three eigenvalues Xj^(g) 
corresponding to each 5 are real and distinct. Given that the 
eigenvalues are distinct. It follows that each triplet of eigenvectors 
Is a linearly Independent set. Each Fourl&r coefficient w(5 ,0) 
therefore has a unique e}q>anslon In terms of the eigenvectors: 

w(f,0) - I 0^(0 ai(0, (3.12a) 

I 

for some scalars Oj^(^), where the sunmatlon runs over the values i ■ 
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Jt follows from Eqs* (3*llb» 3*12s) that Bq. (3*10) can ba 

written 

s«.t) - I «,(0 »,«) (J.12b) 

I 

Substituting this result into Eq. (3.8) yields the Fourier series 
solution of the initial-value problem for the system (3. 1*3.3): 

»<x,t) . I I a,(t) n(5) (3.13) 

5 * 

The solution is a superposition of plane waves (e.g. * John* 1978* Sec. 
5. 2d). 

3.4. The Slow-Wave Subspace 

From Eq. (3.13) it is clear that the eigenvalues 1^(5) are also 
the eigenfrequencies of the solutions of Bqs. (3. 1,3. 3); l.e. * they 
are the rates at which waves of spatial frequency ? can evolve. For 
given i and elgenfrequency Xj^(5) is present in the solution if, and 
only if* the coefficient Ujj,(5) 0 in Eq. (3.13). The quantities 

Oj^(5) are the defining coefficients in expansion (3.12a): frequency 
Xj^(5) is present in the solution if* and only if* the Fourier 
coefficient w(5 ,0) of the Initial data has a component along the 
eigenvector 2^(5). 

As previously noted, it will be shown that, for each 5, the 
eigenfrequencles Xj^(^) are real and distinct. It will also be shown 
that one elgenfrequency, say Xq(^), in fact has magnitude much smaller 
than the magnitudes of the other two: 
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Uo<5>! << UtiOs)], o.u) 

for •ach Actually, bacauai of tha reality condition (3.5a), thia 
naad be verified only for 5 >. 0: according to Eq. (3* 9b), 6(-5) - 
•«(5), which Impllea that lj(-5) - -Xj^CS), ao that |Xj^(-e)| - |X^(0|. 

Slow aolutiona of ayatem (3. 1,3. 3) are those in which only the low 
frequencies Xq( 5) are present. The preceding dlacuasion makes It clear 
that for a solution to evolve slowly, it la necessary and sufficient 
that the initial data w(x,0) lie in the slow-wave aubspace which is 
given by 

Rc ■ ts(x) 6 w(C) - ao(0 ao(5) 

for some scalars ao(5)}, (3.15) 

The slow-wave subspace is the set of all initial data in f which lead 
to slowly evolving, Rossby wave solutions of the continuous , or 
differential, equations. 

It follows from the linearity of Eqs. (3. 4a, 3. 6) that all linear 
combinations of vectors from R^, He in R^,, so R^ is in fact a subspace 
of L, Furthermore, R^ is an invariant subspace of the solution operator 
of the system (3* 1,3. 3) j it follows from Eq. (3.13) that w(x,t) e R^ 
for all t > 0 if w(x,0) e R^.. 

3.5. The Elgenfrequencies and Phase Speeds 

Exact and approximate formulas for the elgenfrequencies Xj^(l5) will 
new be determined, in order to show that the elgenfrequencies are real 
and distinct, and in order to verify their relationship (3.14). The 
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•l(H»-wave eigenvectors £ 0 ( 5 ) will be determined afterwards, to make tbe 
definition (3«15) of the slow-wave subspace more expllelt. 

The elgenfrequencles are the eigenvalues of G(C), which Is given 
by Eq. (3«9b). Setting the determinant of II - G equal to zero gives 

(X-KU)2 - + f2)(X4?U) + f2eu - 0 , (3.16) 

Equation (3.16) Is the dispersion relation for the system (3.1,3;3), 
relating the temporal frequencies X to the spatial frequencies 
For C ■ 0, the roots of Eq. (3.16) are 

Xo(0) - 0 , Xii(O) - ±f . (3.17a) 

Oscillations with frequency f ■ 2n sin 0 , known as inertial 
oscillations , are quite rapid outside the tropics. The period of 
inertial oscillations at 0 ■ 0q, l.e. , for f ■ lO'^sec"^, Is Zv/fS 17.5 
hr. Pure Inertial oscillations are rare In the atmosphere (Dutton, 
1976, Sec. 9.3). The root Xq ■ 0 Is the slow-wave elgenfrequency for 
C « 0. 

The eigenvectors corresponding to the eigenvalues Xj^(O) are 

ao(0) “ (0,0, l)*^ , 3±i(0) « (l,ll,U)’^. (3.17b) 

Since w(0,u) is the average value of w(x,t), the expression for ao(0) 
means that all slow solutions of the system (3. 1,3. 3), as well as the 
corresponding initial data w(x, 0 ) c R^, must have u and v components 
with average value zero. Equations (3.17) correspond to the 
spatlally-independent solutions 
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u(t) - Uqcos ft + vosln ftj (3.18a) 

v(t) - VocQS ft - uo#ln ft» <3.18h) 

Jj 

♦ ■ ♦o U(-UQ + woct»® ft + vosln ft), (3.18c) 

which arise from constant Initial data (uq,vq,^q). 

For ^ > 0, It Is natural to e^q^ress the roots of Eq. (3.16) as 

V i 

phase speeds . Referring to Eq. (3.13), the phase speeds c^(C) a'^i^e 
related to the elgenfrequeucles by 

CjjCO - -^Xji(0 ; (3.19) 

relationship (3«14) will be demonstrated In the equivalent form 

MOI « |c±i(0|. (3.20) 

The roots of the cubic equation 

y3 « ay + b ■ 0 (3.21a) 

are given by 


yg, " 7 'J [(l”'**l) -y “ -y cos ^(“ ^ y ' 2 ' ) 

& 


(3.21b) 


The roots of Dq. (3.16), expressed as phase speeds, are therefore 


given by 
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C 4 <C) - U - y< 5) coi [ (i+1) y co«"^(-e(5))] , (3.22a) 


where 


Y(5) -^^( 52 * +f2)l/2^ (3.22b) 

e(5) -^f 2 eu( 52 * +f2)-3/2, (3.22c) 

The eigenfrequencies are real since + f 2 >0. 

Equation (3.22a) can be approximated In such a way as to make the 
magnitudes of the phase speeds more transparent. The dimensionless 
quantity e(5) Is small, and approaches zero quadratlcally as 5 ±». 

For the choice of parameters f, U, t and L given In Sec. 3.1, e(5(w)) 
", 0.115 for wave number o) * 1, e " 0.074 for u) ■ 2, e 0.043 for U) ■ 

3, and e ■ 0.007 for U) ■ 8. Since 

cos'k-^) "^ + 6 -«- 0 (e^), 

a good approximation to Cjj Is given by 

Cg, (C) = U - Y (O cos [ a+l) ^ ^ ^ e (O] . (3.23) 

Expanding the cosine to first order In a Taylor series about (A+l) ^ - 

j gives 
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co(0 = U-^Y(Oe(5). (3.24a) 

c±i(0 2 U ± +^y( 5) c<5); (3.24b) 


substituting Eqs. (3.22b»c) Into Eqs. (3.24) gives 


co«) 




(3.25a) 


c+i(e) = 


U ± 




1 


(3.25b) 


For the aforementioned choice of parameters f , U, L» and for 
the first eight wave numbers, the exact values of the phase speeds, 
Eqs. (3.22), are presented in Table 1. The corresponding approximate 
phase speeds, Eqs. (3.25), are presented in Table 2. Comparison of the 
tables Indicates excellent agreement between the approximate and exact 
values, and both tables verify the separation of phase speeds (3.20). 

Individual wave components of slow solutions of the differential 
equations have the relatively small phase speeds Cq(^ ) and are called 
slow waves , or Rossby waves . They are an Important feature of 
midlatitude atmospheric dynamics. The slow waves retrogress: as 
Indicated by Eq. (3.25a), their propagation relative to the mean 
current U is westward. The slow wave phase speeds are comparable to U, 
and increase monotonlcally toward U as the tfave number Increases. 
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The fast waves , with the large phase speeds are known as 
inertia-gravity waves since they are the usual gravity waves of 
shallow-water theory, for which c^j ■ U ± /♦, modified by the presence 
of the Coriolis force. The speed of the inertia-gravity waves is 
dominated by the second term in Eq. (3.25b); waves with speed Cj^ 
propagate toward the east and waves with speed c_j^ propagate toward the 
west. The inertia-gravity wave phase speeds decrease monotonically in 
magnitude, toward |U ± /$|, as the wave number increases. 

As explained in Sec. 1.2, Bossby waves and inertia-gravity waves 
are both present in slow solutions of the fully nonlinear , 
primitive-equation models actually used in NWP! a small inertia-gravity 
wave component maintains the quasigeos trophic equillbriun. Slew 
solutions of our linear shallow-water equations model, produced by 
Initial data in the slow-wave subspace consist entirely of Rossby 
waves. 

3.6. Approximate Slow Initial Data 

Having determined the slow-wave eigenvalues Xq(^), the slow-wave 
eigenvectors upon which the definition (3.15) of the slow-Wave 
subspace depends, are obtained by solving Eq. (3.11a) with t •* 0<, It 
is found that 

3o«) -[y«). +j^)l^ (3.2fa) 


for each 5 , where 


(3.26b) 


M(0 
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cos ^ cos”k-c(5))] . 


and y(?), c<5) are given by Bqs. (3.22b,c). 

Introducing the same approximations into Eq. (3.26b) as those 
that lead to Eqs. (3.24) results in 


m(5) 


g2*+f2 


whence Eq. (3.26a) becomes 


(3.27) 


ao<5 ) = [ 


1C 

C^*+f2 * 



(3.28) 


Eq. (3.28) is exact for g ■ 0; it reduces to the expression for 
in Eq. (3.17b). 

If w(x,0) eRg, then u(x,0) and v(x,0) are determined by (j>(x,0). 
Eq . (3.28), with Eqs. (3.6,3.7,3.15), implies that if ^(x,0) is 
specified arbitrarily. 


(j>(x,0) - I ao(C) 

C 


(3.29a) 


then approximate formulas for u(x,0) and v(x,0) such that w(x,t) 
evolves slowly are 


y 

g g2*+f2 


ao(C) 


u(x,0) 


(3.29b) 
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v(x,0) S ■ y>x^x,0), (3.29c) 

Fornulas (3.29b,c) are exact to first order in the small parameter 
e(C). 

Geostrophlc states of our linear system (3.1) are those for which 
u ■ 0 , V ■ ; (3.30a,b) 

the u-component of a geostrophlc state is zero since there is no 
pressure-gradient term to balance the Coriolis term in Eq. (3.1b). 
Equivalently to Eqs. (3.30), the Fourier components of a geostrophlc 
state are 

wCO - [0, i5/f , 1]’^ . (3.31) 

Comparing Eq . (3.31) with Eq . (3.26a) » it follows that slowly 

evolving states of our linear system are not geostrophlc. 

However, comparing Eqs. (3.30) with Eqs. (3.29b,c), we see that 
slowly evolving states are quasi geos trophic. For the v-components , 
Eqs. (3. 29c, 3.30b) , this is readily apparent. As for the 
u-coraponents, a numerical calculation shows that is the dominant 
terra in the denominator of Eq. (3.29b), except for wave number (u » 1, 

for which and f^ are roughly equal. It follows that u(x,0) and 

^(x,0) - 00 ( 0 ), given by Eqs. (3. 29b, a) respectively, are 

approximately proportional, with constant of proportionality U/*. The 
amplitude ^0 of the perturbation geopotential ^(x, 0 ) is typically 
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•malleT than the nean geopotential height 6 by an order o£ aagnltude. 
The amplitude Uq of u(x,0), 

“0 ~ ^ I^O “ I y ♦© » 

la therefore smaller than the mean zonal current U by at least an order 
of magnitude: slowly evolving solutions have small u**components • 

Hence, slowly evolving solutions are quaslgeos trophic. 

The u^component of solutions of the linear system (3*1) Is 
special. Its magnitude In our assimilation experiments will provide 
one convenient check of the proximity of state estimates to the 
slow-wave subspace. 
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CHAPTER POUR 

SLOW SOLUTIONS OF DISCRETE LINBAR SHALLOW-WATER EQUATIONS 

We Introduce now the discretization f of the linear shallow-water 
equations which will be used later In the assimilation experiments; 
here we *^ontulate the discrete model's slow-wave subspace R . The 
discrete slow-wave subspace Is defined directly in terms of Y, rather 
than In terms of the original differential equations or their slow-wave 
subspace . In particular^ R will have the property of being an 
invariant subspace of Y , and this property will be Important In our 
formulation of the modified KB filter. 

^•1» The Discrete Equations 

To discretize the differential equations, we use the Rlchtrayer 
two-step formulation of the Lax-Wendroff scheme (Rlchtrayer and Morton, 


1967, Sec. 

12.7 a nd 

13. A). Reasons 

for 

the 

suitability of this 

particular 

scheme 

to discretization 

of 

the 

linear shallow-water 


equations appear In Ghll 6j^^. (1981, Sec. 3.2). The scheme Is 

second-order accurate In time and space, and fourth-order dissipative 
In the sense of Krelss (Rlchtnyer and Morton, 1967, Sec. 5. A). 

The finite-difference grid 

tjj ■ kAt, k» 0,1,2,..*, (A.la) 

xJ - jAx, J - -^+1, -y+2, ..., y, (A. lb) 


Is Introduced, Where 
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Ax ■ L /M f 


(4.1c) 


and the number of grid points M Is assumed to be even. The 3-vector 




(4. Id) 


will approximate the exact solution w(jj^x.kAt). 

Rewriting the original system (3.1) In matrix notation, 

2t “ Cwx + Bw , 


(4.2a) 


where 


C » - 


B - - 


U 0 1 

QUO 
♦ 0 U 

0 -f 0 

f 0 0 

0 -fu 0 J 


(4.2b) 


(4.2c) 


the difference scheme is written, for k ■ 1,2,3,..., as 

2i! - H^-1 + (4t) Wtl^-1/2 

- 2^-1 + II C(wjtl^5 - stlfl) ^ T" 


ORIGINAL PAGE IS 
.59, OF POOR QUALITY 

for j - ’'CM/2)-fi, • nhoro the Internedlate valuee ere 

given by 

^4 stIiSP' 

- (I + ^ S^il) + TC 

for j ■ -M/2,.. M/2 ; the scheme Is closed by defining 

2iS{^ ■ si?^5 . sca*' - 

which corresponds to the periodic boundary condition (3.3). 

By combining Eqs. (A.3a,b), the scheme may be rewritten as 

1 

2^ - I 'I't 

for J ■ -(M/2)+l, . . . ,M/2, and k ■ 1,2,3,..., where 

2iS{^ “ ’sic-1 » (A.Ab,c) 

and the 3x3 matrices are given by 

4»o - I - o2c2 +^B), (A. 5a) 

’*'±1 - ± ^ C + ^ c2 (CB + BC) B(I + ^ B), (A. 5b) 


where 
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(4.5q) 

Finally, to write the difference echeme in the notation of Chapter 
2, we Introduce the composite vector 

& - ; (4.6) 

wjj is an n-vector, with n ■ 3M, which is composed of H 3-^ectotJ« 
Equations (A«4) can then be written as 

Sk " Hk-1 » 

vihere t is the 3MxM block-circulant matrix with 3x3 blocks (Davis, 
1979, Sec. 5.6) denoted by 

'J' ■ clrc » (4.7b) 

the individual blocks 'i'o*4'±| are given by Eqs. (4.5). 

A 3Mx3M block-circulant matrix with 3x3 blocks is a matrix that, 
when partitioned regularly Into 3x3 blocks, has M arbitrary blocks 
across its first row, with each of the succeeding M-1 rows obtained by 
circularly shifting the previous row one block to the right. It is 
denoted by listing the blocks Which appear across the first row, as in 
Eq. (4.7b). 

Thus, our dynamics mtrix f has only three nonzero blocks in each 
row. In fact, Y is almost block-tridlagonal: the only nonzero blocks 
away from the diagonal are in the upper right and lower left corners of 


ommmi PMSi ^ 

OF POOR QUAUltY 


0 ■ 


At 

Ax 
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Y» which contain the blocke T_j and raapactlvelyi aa a teault of 
the periodic boundary conditions. The matrix la Independent of the 
time kAt» unlike the general dynarolca matrix of Chapter 2. 

Equatlona (4.7), with the aubmatrlcea ¥q , flven by Eqa. 

(4.5) » conatltute the discrete "forecast model" upon which the 
assimilation experiments of Chapters 6 and 7 will be based. The 
remainder of this chapter Is devoted to formulation of the discrete 
model's slow’-wave subspace. The development parallels that of Chapter 
3: the discrete Fourier transform is introduced first , then Wj^ Is 

written In terms of the Fourier coefficients of -arbitrary Initial data 
Wq , end finally the slow~wave subspace Is defined. 

4.2. The Discrete Fourier Transform 
An M-vector u is now denoted by 

u - iu(-i^+l) ,..., u(^))'T ; (4.8a) 

for consistency with Sec. 3.2. spatial Indices appear as arguments 
Instead of as superscripts. The discrete Fourier transform of u is the 
M-vector 

U-(0(-|+l) i(5))T, (4.6b) 

whose components u(u>) are defined by 

u(oj) g-2irljw/M ^ 


(4.9a) 


r 






w 
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for fat - -(M/2Hlp...»M/2. If u(fa>) it givtn by Iq. (4.9t)» thtn u(J) 
can b* rtcovtrtd by tht invrtion foptula 


uO) - ^ I Ku), 


(4.9b) 


fa) 


for j - -(M/2)4‘I,..*,M/2« In Bqt. (A. 9) and in tha saquali the 
tymbolB ][j end always refer to tumoatlon over the index set 
{'- y +1,..., . Formilat (A. 9) are discrete analogues of Bqs. 

(3.4,3. 6). 

Equations (4.9) can be written more compactly in matrix notation. 
The MxM Fourier matrix (e.g., Davis, 1979, Sec. 2.5) is the matrix 
whose (t ,m)^^ element is 


l-2tti(p- ^)(q- y)/M]. (4.10) 

The Fourier matrix is symmetric and unttarv t l.e., 

* (4. lla,b) 

where the asterisk indicates the complex conjugate transpose. From 
definitions (4.8, 4.10), it follows directly that Eq. (4.9a) can be 
written as 

u - F„ u , (4.12a) 

while Eq. (4.11b) then implies that Eq, (4.9b) becomes 






L 
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* * 

U ■ Fm u* 

M* pi m 

For an n-vector conposed of N 3"vactor8y 

W - (!!''(- 5+0 s''<5)l*> 

where 

w(j) - lu(j)» v(j), ♦(j)]'^. 

the Fourier transforiii Is defined consistently with Eqs* (4 
is, w is the n-vector 

*w. ' 

1 - is’^(-5+i) w’^(5)i’^. 

whose component 3-vectors 

w(uj) » [u(<d), v(u), 


are given by 


Wto) e-Z'O”/*' w(J), 


for u» - “(M/2)+l, . . . »M/2. The inversion formula is 


<4, 12b) 

(4.13a) 

(4.13b) 

. 9) . IRiat 

(4.14a) 

(4.14b) 

(4.15a) 


»(J) . I J 1(0,), 


(4.15b) 
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for J - -(M/2)+l,...,M/2. 

In order to write Eqs. (4* 15) In matrix notation, we define the 
nxn permutation matrix V aa that matrix which reorders the elements of 
2f, Eqs. (4.13), according to 

^ (4.16) 

where the M-vector u is defined in Eq. (4.8a), and v and ^ are defined 
similarly; V is real and unitary, so that 

vT V* - V”1 . (4.17) 


It follows from Eqs. (4.11,4.17) that the nxn matrix F, defined 


by 


F - V 


-1 


Fm 0 0 

0 F„ 0 


OOF, 


M- 


V , 


(4.18) 


is symmetric and unitary. 


f’F - F , F* - F’^ 


(4.19a,b) 


Equations (4. 12a, 4. 16, 4.18) imply that Eq. (4.15a) can be written as 

w •« Fw , (4.20a) 

which, with Eq. (4.19b), implies that Eq. (4«l5b) becomes 
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w - F*Ws ^ (A. 20b) 

The n-vectors w and w given by Eqa.(4.20) constitute a discrete Fourier 
transform pair. 

4.3. Solution of the Initial-Value Problem 

The Iterates Wj^ of the finite-difference scheme (4.7) can now be / 
expressed In terms of the Fourier components of arbitrary Initial data 
Wq. It follows from Eqs. (4.20) that Eq. (4.7a) Is etjiulvaleht to 

Wk - ? Wk_x , <4.21) 

where the nxn matrix T Is given by 

$ - FVF* . (4.22) 

Due tdii the block-clrculant structure of ¥ , the matrix F is 
block-diagonal ; the matrix F block-diagonallzes all 3Mx3M 
block-clrculant matrices having 3x3 blocks (Davis, 1979, Thm. 5.6.4). 

The matrix $ has M 3x3 blocks, denoted by ¥((i>), along Its main 
diagonal, and zeros elsewhere. It Is denoted by 

y - dlagl$(- ” +!),$(- 1+2),.. .,F(^], (4.23) 

and the Individual blocks are given by 

$(a») - I ^ 

J-1 


(4.24) 
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for u - "(M/2)+l,...,M/2; the matrices f j are defined by Eqs. (A.5)» 
The matrix f(w) Is called the symbol, or amplification matrix , of the 
difference scheme (e.g,, Isaacson and Keller, 1966, Sec* 9.5). 

Since T Is blodc“dl agonal, the n-vector equation (4.21) decouples 
Into M 3"vector equations. The decoupled equations are written. In the 
notation of Eqs* (4*14), as 


Wij(w) - f (w) $K-i(w). 


(4.25) 


f or (t> ■ “ y +1 , < 


M. 

'*T 


therefore, by repeated application of ¥((•>), 


^(o)) - $*^(a>) Wq(w). (4.26) 

Eqs* (4.25,4.26) correspond to the fact that, for constant-coefficient 
linear difference (or differential) equations, waves with different 
wave numbers evolve separately. 

Equation (4.26), like Its continuous counterpart (3.10), Is 
slnqtllfled by an appropriate eigenvector expansion. Let 6j^(b)) be the 
eigenvalues of $'(w), with corresponding eigenvectors £)j(u); 

^(w) r^((o) - 64 ( 0 )) rji(m), (4.27) 

for 1 « 0,11. The eigenvalues are generally complex, and we write them 
In polar form, as 


6jj((o) - pg(w) e 


lVji(w)At 


(4.28) 
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r ,-.s: 






where Pj(w) end Vj^((t>) ere reel* The metrlx t'^(w) In Eq. (4.26) hes 
the eene eigenvectors ee ¥(w), but with eigenvalues 6]^(u): 


T*^(w) r^(w) - 6j^(o)) rjt(u). 


(4.29) 


It will be shown In Sec. 4.4 that the triplet of eigenvectors 
corresponding to each o) Is a linearly independent set. Hence, the 
Fourier transform of the initial data can be expanded as 


Wo<(iJ) - I 6 £(w) r^Cw), (4.30) 

4 

for some scalars 6 From Eqs. (4.28-4.30) It follows that Eq. 
(4.26) can be written as 

Wj^(w) - I Pj^Cw) iiiui) pj(w) (4.31) 

I 

Finally, using Eq. (4.31) In the Inversion fomula (4.15b), It follows 
that W| 5 ^(J) can be written as 

I P 4 (<*») exp{ll5(u))x(J) + Vjj(w)ti^l} , (4.32) 

(I) 4 

where C(w) ® 2n(D/Mex and xCj) ■ jAx. 

Equation (4.32) is the Fourier series solution of the 
Inltlal-vaiue problem for the discrete system (4,7), The quantities 

(S£,rj^,V 4 ) in Eq. (4,32) have counterparts (otji*S4»l4) 1" solution 

(3.13) of the continuous equations. The factors P£((i>) are due to 


discretization. 
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4»4. The ElRfttifrequfeiiclcg and Phase Speeds 

According to Eq. (4*32) , the elgenfrequencles associated with the 
difference scheme are the quantities v^(u)). We now show that they 
separate naturally Into low and high frequencies* like the 
elgenfrequencles of the differential equations* and then we define the 
discrete slow-wave subspace. 

To classify the elgenfrequencles* the eigenvalues were 

computed mnerlcally* cf. Eqs. (4.27 4.24*4.5)* after Which the 
elgenfrequencles were obtained by use of Eq. (4.28). As In the 
assimilation experiments of Chapters 6 and 7* the values M ■ 16* At » 
30 min. * were used In the eigenvalue computation. 

The elgenfrequencles corresponding to m > 0 are 

Vq( 0) - 0 , v±i(0) - + 1.0053 f * (4.33) 

In close agreement with the corresponding result (3.17a) for the 
continuous system. 

For wave numbers a> ■ 1*2 *...8* the phase speeds 

V »((!)) 

are presented In Table 3. Comparison with Tables 1 and 2 shows that the 
phase speeds associated with the difference scheme are good 
approximate iS to those of the differential equations only for the 
smallest one or two wave numbers. This behavior Is typical of 
dissipative difference schemes* such as the Rlchtnorer scheme* although 
the discrepancy beween discrete and continuous phase speeds Is somewhat 
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•xaggerated by our relatively coarse nesh: Ax L/16 ■ 875 km«* which 
is nuch larger than mesh spacings typically used in IIUP. The 
discrepancy is an indication of the difference between the discrete and 
continuous slow-wave subspaces. 

Except for <0 " 8, the table does show that the phase speeds, and 
hence the eigenfrequencies, are well-separated: 

|vQ((tt)| « |v+i(w'.|i (4.35) 

for u) ■ The same is true for (■> “ -l,...,-7, since Eq. (4.24) 

implies ^(-u)) ■'i'(aj), whence 6j^(-to) ■ yjj(w) and |vjj(-to)| ■ |vj^((i})|. 

Foru)>M/2 aS, all phase speeds and eigenf requencies are zero: 
Waves with the highest spatial frequency are stationary. All three of 
these waves could be associated with the discrete slow-wave subspace, 
but we select only one, as follows. According to Eqs. (4.5,4.24), 
$(M/2) is given ly 

y(”) - 4»q - - I - (4.36) 

where C is given by Eq. (4.2b). Since the eigenvalues of C are U, U ± 
/♦, the eigenvalues of $ (M/2) are given by 

6o(y) “ 1 - 2 o2u 2, (4.37a) 

6+l(^) - 1 - 2 o2(u ± ✓*)2; (4.37b,c) 

the eigenvalues 6o,±l(^/2) are distinct, although the corresponding 
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frcquencles ^Q^ti(M/2) are al,l zero* Due to the appearance of the 
factors U and U ± /♦ in Eqs. (4.37), and according to the dlacuaslon 
following Eqs. (3.25), It la appropriate to refer to 6 q(M/ 2) aa the 
Roaaby eigenvalue for wave number M/2. 

According to Eqs. (4.33,4.37) end Table 3, the eigenvalues 6^(u) 
are distinct for each b>. The corresponding eigenvectors rj^(u>) are 
therefore linearly Independent, and expansion (4.30) Is valid. 

4.5. The Discrete Slow-Wave Subspace 

It has been shown that the elgenfrequencles Vj^(a)), defined in Eq. 
(4,28) by the eigenvalues 6jj(a)), can be classified into Rossby 
frequencies Vg(u)) and inertia-gravity frequencies v+i(o)), for each 
possible wave number u. The corresponding Rossby eigenvectors 
and inertia-gravity eigenvectors tfj(o)) are defined by Eq. (4,27). 

According to Eqs. (4.30,4.32), if the initial vector Wq has 
Fourier components solely along the Rossby eigenvectors, then the 
corresponding solution w^^ of the difference scheme (4.7) evolves 
slowly, with frequencies Vq(o)), That is, analogously to the definition 
(3.15) of the continuous slow-wave subspace, the discrete slow-wave 
subspace is the set R, given by 

R “{we R*'! w(w) “ Pq(o)) ro(“) 

M. Ml 

scalars eQ(o)) and for all (O “ -y +1,..., y }» (4.38) 


where R” denotes the set of real n-vectors. 
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We give two further definltlone of R , equlvelent to (4*38)# 
Introducing the n~vector 

5^ • lQ,...,0,rJ(u),0,...,0l'^, (4.39) 

M M 

where there are 3((tf+y ->l) zeros on the left and 3(y ■* u) seros on the 
right, an equivalent definition is 

R ■ {we if "I ®°"*' ecalars $q((i))} ; (4.40) 

b) 

definition (4.14a) has been used. It follows from the Fourier 
transform pair (4.20) that another equivalent definition is 

R “ {w e R”: w * scalars pQ(b>)} . (4.41) 

b) 

Definition (4.41) makes it clear that R is a subspace of R". lhat is, 
R is a nonempty subset of R", and 

“iZl “2X2 ® Xl ® R X2 ® ^ » (4.42) 

for all real scalars In fact, R is an M ■ n /3-dimensional 

subspace of R“. 

The n-vector F*s^ represents a pure wave of wave number b>. It is 
also, for each (o, an eigenvector of ¥ with eigenvalue PQ(b)): 

¥f\ - F*¥Ff\ , 

- » 


from (4. 19b, 4. 22) 
from (4.19b) 


from (4.23,4.27,4.39) 


(4.43) 
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According to definition (4.41) , R ii therefore en inve riant lubepiice 
of f: 

tX c R if X e R • (4.44) 

This fact also follows Immediately from Eqs. (4.30,4.32): if the 
initial data Wq e R, then W|^ c R for all k. 

Analogously to Eq. (4.38), the discrete fast-wave subspace 6 , 
consisting of inertia -gravity waves, is defined as 

* 

6 - {w e w(w) ■ P-i(w) E-i(w) + 0i<w) ri(u) 

M M 

for some scalars 8±i(w) a:id for all w ■ “ -y +1*.**» Y } 

G is a 2H-dlmen8lonal Invariant subspace of *f. Taken together, R and G 


span all of R", 
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CHAPTER FIVE 

ESTIMATION THEORY AND INITIALIZATIOH 

1 

j 

5»1* Introduction 

Although the Kalman-Bucy filter poeseesds many optlnallty 
properties, It lacks one property of primary Importance In numerical 
weather prediction. Namely, there Is no guarantee that the state 
estimates produced by the KB filter evolve slowly: the KB filter does 
not solve the Initialization problem. In this chapter we Introduce a 
filter, or data assimilation scheme, which, automatically produces 
slowly evolving state estimates and which retains much of the 
optimality of the KB filter. The filter consists of a single 
modification to the usual KB gain matrices. He now summarize the 
results concerning the modified KB filter. 

The standard KB filter was derived in Chapter 2 by solving an 
unconstrained minimization problem: the quadratic error functional 

" El(wk - 2k>'^ A<2k ” 2k>l 

was minimized. In turn, at each time k 1,2,3,... . The modified KB 
filter is derived by solving a constrained minimization problem: again 
is mlnlml'^.ed with respect to the gain matrix K|^ , but now subject to 
the constraint that 

Range Kj^ C R, 

l.e., that each column of Kj^ lies in the discrete slow-wave subspace. 
As a result of the fact that the discrete slow-wave subspace is an 
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Invariant lubcpAce of V » and provided e R , we will aaa that 

■atlifactlon of thla constraint la neceaaary and sufficient for the 

state estlnates to evolve slowly. 

It was shown In Chapter 2 that the K& filter nlniiBlKes for all 
choices of the positive semldeflnlte weighting matrix A. Constraining 
the state estimates to evolve slowly results In a trade-off: the 
modified filter depends on the choice of A. When using the modified 

flltert one imist actually choose the error functional to be minimized. 

Provided the class of weighting matrices A Is suitably restricted ^ 
the constrained minimization problem has a unique solution • and the 

modified KB filter Is uniquely determined. It Is given by multiplying 
the usual KB gain matrices by a projection matrix II which depends on A* 
n ■ n(A) ; n is the A-orthogonal projection matrix onto , and will be 
defined below. We denote the modified filter's gain matrices by 

kJ*® - nKj[®. 

The modified filter corresponds to following the standard KB filter 
with linear normal mode Initialization at each observation time. 

Linear normal mode Initialization, In Its non variational form, 
consists of setting to zero all fast components of the analysis vector, 
while leaving the slow components unchanged, Cf. Eqs. (1.3). For the 
modified filter, this Is accomplished by taking II to be the projection 
onto R along the fast-wave subspace ® . We refer to this projection as 
parallel projection . This projection Is A-orthogonal for an 
appropriate choice of the weighting matrix A. 

Other choices of A correspond to performing variational linear 
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nonutl Mode Initialisation! th« fast conpontnta ace atill aet to seco» 
but the slow components are altered alaOf cf* Eqa* (1.7, 1,8). One 
such choice la A * in which case n la the usual orthogonal 
projection onto As a result of the fact that the slow-wave subspace 
R is not orthogonal to the fast-wave subspace 6 ^ we will see that the 
orthogonal projection is not the sane as the parallel projection, so 
that the corresponding filters produce different results. This is the 
general situation! it is not particular to the continuous model (3,1) 
or to its discretisation (4.7). Cases in which G and R are 
orthogonal are very special. 

The choice A - I is not appropriate for our model, since it 
corresponds to minimizing a sum of squares of dimensionally 

inconsistent quantities. We introduce therefore an additional 
projection, the mini mum-energy projection , in which A is chosen as the 
diagonal matrix which makes the expected energy of the analysis 

error. 

After relevant material on projections is discussed in Sec. 5.2, 
the modified filter is formulated in Sec. 5.3, It is shown in Sec. 
5.4 how to efficiently compute A-orthogon?'* projections onto R , for 
general classes of weighting matrices A. 'I'lie parallel, orthogonal, and 
mininum-energy projections are discussed in Sec. 5.5. 

Some of the results of this chapter are stated as lemmas and 
theorems. These are all proven in the Appendix. 

5.2, Projection Matrices 

Let S be a subspace of R*^. That is, S is a nonempty subset of 


R" and 


7^ 


*l5l ®252 ® ^ Si ® ^ S2 ® » 

for all real scalars ai>02» Hotice that S mist contain at least the 
zero vector t 

An wn matrix n Is called a projection ■atrlx onto S, or siniply a 
projection, If It has the properties 

Range n • S , (5.1a) 

n2 - n ; ' (5.ib) 

the range of a nacrlx is the set of linear combinations of Its columns, 

l.e. , 

Range n ■ {x: x ■ njr for some y e r”} • 

If n is a projection onto S and x ■ Jly, we refer to the vector x as a 
projection of jr onto S . 

A subspace has a simple characterization in terms of projections 
onto It. Suppose H Is a projection matrix onto S. If x is a vector 
in S, then 

X ■ Ily , for some y e R” , by Eq. (5), la) , 

- n\ , by Eq. (5.1b) , 

■ Hx , since ny ■ x ; 

l.e., X « nx. On the other hand, if x c R” and x ■ IIx, than Eq, 
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(5. la) Implies that x e S . That ia^ If II la a projection onto we 
can write 

S - {x e R": Hx - x} , (5.2) 

Equation (5.2) states that a projection matrix acts on Its range 
like the Identity matrix. To completely characterize a projection 
matrlXt It remains to specify how It acts on the rest of R”. One way 
to do so Is as follows. 

Let V be a fixed » but arbitrary, positive semldeflnlte nxn matrix. 
The kernel , or null space, of V is the set of its null vectors, 

Ker V - {x e R”: “ 0} . 

Since V is positive semldeflnlte, rather than positive definite, Ker V 
may contain vectors other tlian the zero vector. 

Two vectors Xl»22 ^ said to be orthogonal If 2,lZ2 “ 

More generally. If 2l'^i2 “ then the vectors are said to be 

V-orthogonal . In particular, and ^2 V-orthogonal If e Ker V 

or 22 e Ker V, 

Suppose one can find a projection matrix II which, in addition to 
satisfying Eqs. (5.1), satisfies 

(vn)’’^ - vn . (5.3) 

It follows that. If 2 Is any vector In R”, then 


x^V(2-Il2) ■ 0 
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fot all keS . That la, the vector which is the vector 

difference between 2 and Its projection onto S , la V-orthogonal to 
every vector x e S • Indeedi fromEq. (5.2) we have IIx ■ x« whence 

x'^'vCjf-nx) - (irx)%<x-nx) 

- x’^(vti)'f(x-n2) 

- x‘*^vn(;^-nx) 

- 0 ; 

the last two equalities follow from Eqs. (5.3,5.1b), respectively. 

A project ion matrix onto S is therefore called V-^-orthogonal if, 
in addition to satisfying Eqs. (5.1), it satisfies Eq. (5.3). In the 
special case in which V is actually positive definite, it is well-known 
that Eqs. (5.1,5«,3) define a unique matrix IT (e.g., Halmos, 1958, Sec. 
75), i.e, , Eq. (5.3) serves to characterize the projection matrix 
(5.1). Eor reasons which will soon be msde clear, we allow V to be 
semldefinite . One can still find a V-orthogonal projection matrix onto 
S in this case, and there is a simple necessary and sufficient 
condition under which the projection matrix is determined uniquely. 

Lemma 1 . Let S be a subspace of R” and let V be a positive 
semldefinite rtxn matrix. Then there exists a V-orthogonal projection 
matrix onto S . 

If S*= {0} or if S» R”, it is clear that there exists exactly one 
V-orthogonal projection matrix onto S, regardless of the choice of V. 
In the former case it follows from Eq. (5.1a) that II “ 0, while in the 
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liltter c^ise n * I since n nust act like the identity on its range, 
which In this case Is all of In both cases Eqs. (3, 1, 5, 3) are 
satisfied. We therefore state the uniqueness criterion only for proper 
subspaces S , l.e., for subspaces of other than (0) and R" Itself. 

Lemma 2 . Let S be a proper subspace of R*^ and let V be a positive 
semldeflnite uxn matrix. There exists a unique V-orthogonal projection 
natrlx onto S If and only If 

S n Ker V - {0} . ’ (5*4) 

The symbol n Indicates set intersection. Lemma 2 states that 
for exactly one matrix II to exist which satisfies Eqs. (5.1, 5.3), It 
is necessary and sufficient that S and the kernel of V have only the 
zero vector in common. 

The uniqueness condition (5.4) la satisfied. In particular, if V 
is actually positive definite, for then V Is nonsingular and Ker V « 
{0}, The I-orthogonal projection matrix onto S, known simply as the 
orthogonal projection onto S, Is therefore unique. 

We have already seen that if n is a V-orthogonal projection matrix 
onto S and if jr e R”, then the vector X-Hy Is V-orthogonal to every 
vector In S. The following lemma states that, provided the uniqueness 
condition (5.4) Is satisfied, there Is a vector In S which is 
"closest" to jr, and In fact the "closest" vector Is lly. 


I,emtna 3, liet S be a proper subispace of R”» let V be a positive 
senldeflnlte nxn io3trlx» and let en arbitrary vector e R” be given* 
There exists a «jnlque solution x of the problem 

minimize (x - V(x - j;) , (5.5a) 

subject to X e S , (5»5b) 

If and only If S and V are such that Eq. (5.4) is satisfied. In which 
case the solution is 

X - n X f (5.6) 

where H is the unique V-orthogonal projection onto S. 

Suppose now tliat V is actually positive definite. In this case, 
there is a simple formula for the V-orthogonal projection onto S, in 
terms of the (I~) orthogonal projection onto S and in terms of the 
square root of V. 

For every positive definite natrlx V, there is a unique positive 
definite matrix B such that > V. This matrix is called the 

(positive) square root Of V, and we denote it by the inverse 

(positive) square root of V, defined by 

= (yl/2)-l^ 


is also positive definite. 

The square root of V can be constructed as follows. Since V is 
symmetric, V can be diagonalized by an orthogonal matrix U, 
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V w UDU^ , UTU - UUT . I. (5.7a.b) 

Here U Is an nxn matrix whose columns ere the normalized eigenvectors 
of V, and D is a diagonal matrix whose diagonal elements are the 
eigenvalues of V; > 0 sinoe V is positive definite* The square root 
of V is then given by 


vl/2 « u pl/2uT , (5.7c) 

where dI/2 

is the diagonal matrix whose diagonal elements, , are 

the positive square roots of the eigenvalues ef V, Vfe have also 

y-lfl « u p-l/2uT^ (5.7d) 

where 

Lemma 4 . Let S be a subspace of and let V be a positive 
definite nxn matrix. Denote by Hy the V-orthogonal projection onto S 
and denote by JIj the orthogonal projection onto 5. Then 

n, . v-i/2 Hi v'/2. (5.«) 

Lemma A has a simple generalization. Applying the lemma to 
another positive definite matrix W, we have 


f 


82- 




or 


Hi - n^ , 

which, upon substitution Into Eq. (5.8), gives 

Hy - ^,-1/2 yl/2, (5.9) 

Thus, two projections based on positive definite matrices can always be 
expressed in terms of one another. 

In case V is positive definite, then 

5*Z ® 

defines an inner p roduct on R”, with corresponding norm Ixly , 

IxHy = (x*x)v “ X e R” ; 

Ixly >_ 0 and I xH y “ 0 if and only if x “ 0. Equation (5.3) is 
equivalent to requiring a matrix II to be symmetric with respect to this 
inner product, l.e. , 

(x»ll2)v “ 1S»2 E R" , 

which is the usual way of defining orthogonality of projection matrices 
on inner product spaces (e«g>, Halmos, 1958, Sec. 75). 

For positive semidefinlte matrices V, Ixly « 0 does not imply x ■ 
0, l.e., 1*1 Y is only a semlnorm on R". However, we note that the 
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unlqueness condition (5*A) Is necessary and sufficient for Mly to 
define a norm on the aubspace S ; see the proof of Lemma 1 in the 
Appendi)>% 

Lemmas I and 2 will be important in the derivation of the modified 
Kalman-Bucy filter. The arbitrary subspace S of the present section 
will be taken to be the slow-wave subspace R , and the matrix V will be 
the weighting mStpix A of the error functional p. ^ 

The uniqueness condition of Lemma 2 will turn out to be the 
condition for uniqueness of the modified KB filter, l.e., the class of 
positive semidefinlte matrices A satisfying 

R n Ker A = {0} 

will be the appropriate class of weighting matrices. We saw in Sec. 
2.4 that positive definiteness of A is the condition for uniqueness of 
the standard KB filter; positive definiteness of A is not necessary for 
uniqueness of the modified KB filter. However, the modified filter 
will depend on A, regardless of whether A is positive definite. Some 
natural weighting matrices , both semideflnite and definite, will be 
considered in Sec. 5.5. 

Lemma 3 serves as a prototype for Theorem 1 of Sec. 5.3, from 
which the modified filter follows, and it gives the modified filter an 
Interpretatlct in terms of normal mode initialization. Lemma 4 will be 
used in Sec. 5.4 to describe one way, among others, of computing the 


modified filter 
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5«3. The Modified Kalnan-Bucy Filter 

We begin by formulating a ilmple necessary and sufficient 
condition under which an asaimilation scheme for our discrete model ¥ , 
Eqs. (4.7), will yield slowly evolving state estimates* As usual, we 
consider only assimilation schemes which are linear and unbiased. 
According to Eqs. (2. 7, 2. 8), these schemes are of the form 


Sk - fSk-l . 

(5.10a) 

+ 

1 

(5.10b) 


for times k when observations are available; when no observations are 
available, Eq. (5.10b) is replaced by 

- wjE . (5.10c) 

The notation 

“ 2k “ *^k2k * 

for the observednninus-f orecast residual , has been Introduced in Eq. 
(5.10b). Ibe true states, w^ , and the observations, , are assumed 
to be given by stochastic-dynamic models (2.2, 2. 3), with ■ 4* given 
by Eq. (4.7b), 

The state estimates in Eqs. (5.10) will be said to evolve slowly 
if they always lie in the discrete slow-wave subspace, l.e. , If 

Wj^ e R and w^ e R for k*l,2,3,... . 


(5.12) 


..■0 
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An immediate consequence of the fact that R is an invariant subspace of 
¥ , Eqs* (A.A2,4.A4), is that (5.12) is satisfied if and only if 

w§ c R (5.13a) 

and 

Kj^wj^ c R at each observation time k. (5.13b) 

Condition (5.13a) says that the assind^latlon must start from an 
Initialized state, while (5.13b) is a condition on the "correction" 
vectors in Eq. (5.10b). 

To see that conditions (5.13) ltn>ly (5.12), notice that if 
e R , then 2^ g ^ since R is invariant under ¥ ; if k is not an 
observation time then we have w^ “ 2^ e R » while if k is an observation 
time and Kuw]^ e R then we still have w^ c R • since R is a subspace. 
Upon continuing the cycle, the implication is clear. On the other 
hand, if e R at some observation time k but Kj^w^ 4 R , then w^ ^ R 
since R is a subspace. Thus, (5.12) and (5,13) are equivalent. 

Conditions (5.13) are necessary and sufficient for the 
assimilation scheme to yield slowly evolving estimates for a particular 
realization of the state end observation processes, Eqs. (2.2a,2,3a). 
For the scheme to yield slowly evolving estimates for all realizations 
of the state and observation processes, it is necessary and sufficient 
that 

Wq e R (5.14a) 


and 
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■ - ...-.,...-.-.....,..4^ , ,, .... 

Range Kj^C R at each obaervetlon tlw k • (S.lAb) 

To see thals this is the case, recall from Sec. 2.2 that the gain 
matrices are supposed to be nonrandom, l,e. , they are supposed to be 
independent of Individual realizations of the state and observation 
processes. The residual is a random vector and, unless restrictions 
are placed on the system noise and observational noise, w|[ can take on 
any value in R". Therefore condition (S.lAb), which says that e R 
for all xe R” , is Just the statement that (5.13b) should hold for all 
realizations. In other words, the set of gain matrices which satisfy 
(S.lAb) is the set of gain matrices which are Independent of wj^ and 
which satisfy (S«.13b). 

If the initial estimate satisfies (S.lAa) and if the gain matrices 
satisfy (S.lAb), then the state estimates evolve slowly in bstween 
observation times, as well as after the final observation time k ■ N. 
That slow evolution is possible depends crucially, as We have seen, on 
the fact that R is an Invariant subspace of S'. This is why we work 
directly with the discrete slow-wave subspace R . Any other discrete 
approximation to the continuous slow-wave subspace R^ Will not have 
the property of being Invariant under 4* . 

The Kalman-Bucy filter generally does not yield slowly evolving 
state estimates, tinless it is assumed that the true state evolves 
slowly , 

w^ E R , f or k • 0, l,2,..u . 


(5.15a) 
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If (5.1Aa) Is satisfied* then (5.15a) Is equivalent to requiring the 
Initial estlnatlon error and the aystem noise to lie In R * 

H^"S5 ® ^ ® ^ k«0,i,2,... ; (5.15b) 

cf. Eq. (2.2a). If (5.15b) holds* then It follows from Eqs. 
(2.22b*c*d) and the definitions of the Initial estimation error 
covariance and the system noise covariance* Eqs. (2.9b*2.2c)* that the 
KB gain mtrlces satisfy (5.14b): the state estimates evolve slowly. 
See also Petersen (1973) for a similar result. 

We do not assume conditions (5.15a) or (5.15b) to hold: 

atmospheric states generally do have fast components. Instead* we seek 
an alternative to the KB filter* by imposing condition (5.l4b) as an 
explicit constraint on the minimization of the usual error functional 

"^k * 


'^k “ ^^-k“2k^] » (5.16) 

A Is a fixed* but arbitrary* nonrandom positive semldefinite nxn 
matrix. That Is, we seek gain nmtrlces Kj^ which 

minimize with respect to Kj^ * (5.17a) 

subject to Range Kj^ C R ^ (5,17b) 

at each successive observation tine k; we already know that K^ ■ 0 If 
there are no observations at time k. Notice the similarity between 
problem (5.17) and problem (5.5) of Lemma 3. The Solutions* and 
conditions for their uniqueness* are also similar. 
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Theorem 1» All eolutloni ot problem (5.17) ere given by 

■ Hjj ^ * (5.18) 


where Is any A-orthogonal projection matrix onto R, l.e.» any matrix 
such that 


Range II 

- R » 

(5.19a) 


- n , 

(5.19b) 

(An)''^ 

All 1 * 

(5.19c) 


and where is any nxp matrix such that 

Range Lj^ C R O Ker A ; (5.20) 


is the usual Kalraan-Bucy gain matrix, Eq. (2.20a). There exists a 
unique solution of problem (5.17) if and only if the weighting matrix A 
is such that 


R n Ker A ■ {0} ; 


(5.21) 


in caseEq. (5.21) holds, there exists exactly one A-orthogonal 
projection matrix onto R , denoted by II, and the unique solution of 
problem (5,17) is 


- kP 5 I kP . 


(5.22) 


isa » m 


•V: *■ -» * 
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The proof of Thcorc« 1» i^ich appoars In tha AppandiXi followa 
aaaily from Lennai 1 and 2 and from our darivafion of tha KB filtar In 
Sac* 2«4« 

Theorem I states that the constrained ninlndsatlon problem (5*17) 
uniquely determines a gain matrix sequence if and only if the error 
functional is based on a weighting matrix A which satisfies K ^Ker A ■ 
(0), in which case the gain matrlcea arc obtained by multiplying the 
usual KB gain matrices by the A-orthogonal projection onto R . The 
uniqueness condition is not very restrictive* For example, weighting 
mtrlces of interest are usually positive definite, rather than merely 
positive seraideflnlte; if A is positive definite, then it is 
nonsingular and Kep A ^ {0}, so that RH Ker A • {0} automatically* 

Essentially, weighting matrices satisfying the uniqueness condition are 
the appropriate ones for consideration. We discuss some of them, 
including some singular ones. In Sec* 5.5. 

Suppose, therefore, that A satisfies the uniqueness condition, Eq* 
(5*21). According to Eq 8* (5*22,2*20), the resulting gain matrices 

are given by 

(5* 23a) 

at observation times k, where n is the unique A-orthogonal projection 
onto R, and 


kJ’® - 0 


(5*23b) 


In the absence of observations at time k* We refer to the corresponding 
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ORiQINAL PAOE W 
OF POOR QUALITY 


data aaalnU.atlon achttM at the nodi fled Kalmn-Bucy filter * To 
•inmarite, it ia Riven in full by 


■ '•'sS-i > 

(5.24a) 

4 ■ + Vi . 

(5.24b) 

«k-*r. 

(5.24c) 

P5 - (I-KkHk)pH(I-Vk)'' + Mkl^' 

(5.24d) 

Sk " Sk *^^Sk “ **k-k^ * 

(5.24e) 


for k ■ l,2,3>..,{ cf. the atandard KB filter* Eqs. (2*22). Note 
that the general formula (2.10b) is used for the analysis error 
covariance matrix P^. 

The modified KB filter results in slowly evolving state estimates 
provided Initialization Is performed at the start of the asslmllat’lon, 
Bq. (5.14a). The filter Is optimal In the sense that slow evolution 
Is achieved simultaneously with the successive minimization of the 
error functionals rij^. Unlike the standard KB filter, however, the 
modified filter depends on the error functional's weighting matrix A, 
cf. Eqs. (5. 19c, S. 8). One mist therefore choose the error functional 
to be minimized. 

Lemma 3 offers a simple interpretation of the modified filter. 
Equation (5.24e) can be written 

Sk - Sk + in^“(»k - "kSk)- 

If Wq e R and the modified filter has been used up to time k, then 
w|^ c R . Therefore ■ H w|^, cf. Eq. (5.2), and we have 
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!«■"(!!{ + '4’<!!2 - - nxS i (5.25) 

here Is the analysle vector that would be produced by using the KB 
filter at tine k, cf. Eq. (2.22e). Thus, ^ is the A-orthogonal 
projection of onto R, 

In fact, according to Lemma 3 and Eq. (5.25), w^ Is that vector 
In R which Is closest to the KB analysis vector , In the sense that 

(sS - Xk>^ *<Sk - J!k) »-2«> 

Is minimized. In other words, the modified KB filter Is equivalent to 
following computation of the KB analysis vector with variational normal 
mode initialization: Is an "objective analysis" and w|^ Is the 

"Initialized" version of Sk ^ ^ (5.26) is minimum. 

Equation (5.2Ad), as compared with Eq. (2.22d), determines the effect 
on the analysis error of combining initialization and assimilation. 

Theorem 1 shows that (5.26) is not the only functional being 
minimized by use of the modified KB filter. The functional of the 
difference between the analysis vector w^ and the true state w^ is also 
being minimized. This stronger result obtains, in essence, because the 
assimilation part of our initialization-assimilation scheme is the 
standard KB filter. 

To conclude this section, we point out that although Theorem 1 is 
stated for our discrete model T and its slow-wave subspace R, the 
theorem is actually quite general. The proof of Theorem 1, and the 
discussion leading to the statement of the theorem, depends only on the 
fact that R is a proper invariant subspace of Y ; the actual definition 
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of R 1 b ittiaterlal. In particular^ the theorem holda for other 
discrete models t and their slow-wave aubspaces. In general , the 
theorem shows how to estimate the state of a stochastic-dynamic systemi 
given noisy observations, in case the state estimates are confined to 
an Invariant subspace of the systom^s dynamics* 

In the trivial case R ■ R*', Bq. (5.17b) presents no constraint 
and, as one would expect, the modified filter redu^^es to the standard 
KB filter. To see this we con still use the first part of Theorem 1, 
since only tlie uniqueness part of the proof depends on the fact that R 
is a proper subspace of R". 

If R" R'' then, regardless of the choice of A, there Is exactly 
one A-ort1v3gonal projection onto R , namely n “ 1} see the discussion 
following lemma 1. Equations (5.18,5.20) therefore become 

K^-kJ®+L^, (5.27a) 

Range C Ker A. (5.27b) 

This gives the unique formula gain matrix. If and only 

if Kes A - {0}, l.e. , iff A Is positive definite. Positive 
definiteness of A was the condition already found to be necessary and 
sufficient for uniqueness of the KB filter; see the discussion 
following Eq. (2.19). In fact, the general solution of Eq. (2*18) is 
given by Eqs. (5.27). 

5. A. Computation of Projections onto the Slow-Wave Subspace 


In order to actually carry out computation of the modified KB 
filter, one must be able to calculate the A-orthogonal projection 
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Mtrlx n, or at least to con|)ute IIw for arbitrary vectors w c R^. The 
SBiln result of this sectloiii Theorem 2, gives a formula for efficient 
conputatJon of n in case the weighting matrix A is block clrculant. We 
ahow that in thio case, the fast Fourier transform (FFT) can be ur'^ed to 
coq>ute Ilw in only 0(n log n) arithmetic operations* After discussing 
this case, we appl^y Lemma A to the more general case, in which A is not 
necessarily block circulant* 

Suppose that in the error functional 

n - E[ (w®-w^)'*^ A(w*-w*^) l , 

the weights are homogeneous in space, i.e. , the weights applied to 
variables situated around a given grid point are the same as the 
weights applied to the variables identically situated around each of 
the remaining grid points. Since our domain is periodic, and due to 
the ordering (4.6) of the components of the vectors ^ and w^, this 
means that the weighting matrix A is block circulant with 3x3 blocks, 

A ■ (Aq, A]^ , . . . A.^^2+1> * * * (5*28) 

cf. Eq, (4.7b). That is, with A partitioned regularly into 3x3 
blocks, the 3x3 Bubmatrlces Aq,Ai,...,%/ 2 ,A-m/ 2 .|. 1 .»*«*A _1 appear in 
order across the first row of A, and each of the remaining M-1 rows is 
obtained by circularly shifting the previous cow one block to the 
right. 

The submatrix Aj (A»j) gives the weights applied to the variables 
u,v,^ at the grid point located j intervals to the right (left) of a 
given grid point. In case one applies only local weights, which is 


-94- 


typlcal In practlcei one haa Aj ■ 0 for j 0 and A would be 

block-diagonal . Wltl< the natrlx Aq repeated along the main diagonal of 
A# 

In the sequel, whenever we refer to a natrix as being block 
circulant, it is in^licit that the blocks are 3x3 and that the natrix 
is nxn 2Mx3H, since these are the only block circulant natrices with 
which we will be concerned. 

Our results are based on the fact that if A is block circulant, 
then the Fourier natrix F defined in Eq. (4.18) block-diagonalises A, 
and vice-versa. That is, if A is of the form (5.28) then, defining 


A - FAF", 


(5.29) 


we have 


A - diag [A(- ” +1), A(- 1+2), ..., A(5)), (5.30) 


where the 3x3 raatrioes A((o) are given by 


M/2 

I 


A(«) - I „ „ 

3-4 +1 


— -j- +1 , ..., y . (5.31) 


On the other hand, given any 3x3 matrices A(w), and defining 

A - F*AF, (5.32) 

where A is given by Eq. (5.30), then A is block circulant. For proof, 
see Davis (1979, Theorem 5.6.4). 

The nxn matrix A therefore defines 3x3 matrices X(u), and 
vice-versa; It will be most convenient to work with the 3x3 matrices. 
Recall that A is supposed to be real, symmetric, and positive 


■tmldefinlte* We flret trani|,late theae condltlona into rcatrlcttona on 

X(u>). 


Ijerama 5. If the block clrculant natrix A In Eq. (5.28) la realt 
ayumetrlc, and positive aemidefinite, then the natrlcea X(d») in Eq. 
(5.31) satisfy 


X(-w) - A(u)), 

for 

(a) ■ 0| 1 1 • • s 1 

T ^ ’ 

(5.33a) 

>> 

1 

>>! 

1 



■ 

(5.33b) 

A*(w) » A(<d), 

for 

w , 1 

M i- - y 

M 

• • • » Y s 

(5.33c) 

X*A(o))x > 0 , 

for 

^ 4.1 

W - • Y • 

M 

T 

(5.33d) 


and for all complex S-vectors 


Conversely, given any 3x3 matrices A(u)) which satisfy Eqs. (5.33), the 
block clrculant matrix thereby defined In Eqs. (5.30,5.32) la real, 
symmetric and positive semidefinlte . 

Equations (5.33a>b) express the condition that A is real; they are 
similar to Eqs. (3.5). The conditions that A is symmetric and 
positive semidefinlte are expressed by Eqs. (5.33c,d); these equations 
state that A(u) oust be Hermitian positive semidefinlte . 

We would now like to translate the condition for uniqueness of the 
modified KB filteit'. 


-96- 

R n Keic A • ( 0) , 

Into an equivalent condition on the matrlcea X(u))* To do i0| we will 
have to be aonewhat more specific about the vectors £Q(m) which define 
the slow-wave subspace In Eq» (4*38)» 

Recall that the vectors Tq{^) are eigenvectors of f(w), 
(4t27). Since the submatrlces Vj which define $(u) in Eq> (4.24) are 
real. It follows that 

(5.34a) 
(5.34b) 

since the eigenvalues 6 q( 0) and real, Eqs. (4.28,4.33,4.37), 

it follows from Eqs. (5.34) that the eigenvectors can be chosen in 
such a way that 

tQ(-to) ■■ r q(w ) » for u) *■ 0|lpseep .j. -*1 , (5e35a) 

(5.35b) 

y 

That is, we can assume that £()(0) and cqX^) are real and that, for the 
remaining eigenvectors, £o(-ti)) is the complex conjugate of rQ(o)). We 
do not assume, however, that the eigenvectors are scaled In any 
particular way. 

Definition (4.38) of the slow-wave subspace can now be replaced by 


$(-u>) - V(w), for « ■ 0,1,..., y-1 , 
$(”l - f(”) . 
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R ■ |yj c w(w) ■P()(w)£o(u) foe all u ■ y, 


where p^)(w) are any icalara auch that pQ('n>>) • Pq^***^ 


and pQ(y) ■ Po^y^} * 


(5.36) 


where C” denotes the set of complex n-vectors; Eqa. (4*15) Imply that 
a complex n-vector w is real If and only if 


w(-ui) • w(u), for u •• 0,1,..*, y -I, 


(5.37a) 


''.rMv ''/Ml 

w(y) - w(,.). 


(5.37b) 


whence, by Eqs. (5.35), a vector w defined by 

w(w) ■ Po(w) Eo(w) (5.38) 

is real iff the scalars Po(w) satisfy the conditions in Eq. (5.36). 

Lemma 6 . Suppose the block clrculant matrix A in Eq. (5.28) is 
real, sytnmetric, and positive semideflnite , and define matrices A(u) by 
Eq, (5.31). Then the following three statements are equivalent: 

R n Ker A - {0} , (5.39a) 

A(u)) rg(u) |fc 0 , for w - -y'fl, y , (5.39b) 

ro(w)A(aj)rQ((o) > 0 , f or w ■ +1, y , (5.39c) 


0 
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Condltion (5.39c) tlnply ttate* that In addition to being 
Hermltlan positive aemldeflnlte, Iqa. (5.33c,d)« X(w) mat be poaltlve 
definite on the space spanned by the slow-vave eigenvectors rQ(u). We 
are now ready to state the min result concerning computation of H. 

Theorem 2 . Suppose the block clrculant matrix A in Eq. (5.28) is 
real, symmetric, and positive semldeflnlte. and define matrices X(u) by 
Eq. (5.31). Suppose further that 

R Ker A ■ {0} . (5.40) 

so that there exists a unique A-orthogonal projection matrix onto R . 
denoted by II . Then n is block clrculant and is given by 

n » F* n F . (5.41a) 

where 

n ^ dlag [ n(- Y+1)* 7 •••» (5.41b) 


il (ui ) w ro(w) ro(u) A(u)) . 


(5.41c) 




(5. 4 Id) 
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Notice th4it, according to Leraittaa S and (5i the hypotheaea of 
'nieorem 2 can be replaced by Eqa. (5t3$) and either of Eqt. 

(5.39b»c)* Ihla will be inportant In the next aectlon^ where we define 
projections directly In terns of prescribed matrices XCoi), rather than 
In terms of the corresponding matrix Notice also that Lemma 6 
guarantees that the scaling constants In Eq. (5*4ld) are 

well-defined* 

Theorem 2 gives an efficient method for computing Ilw for any w c 
R**, First, one finds Fw, l.e. , the discrete Fourier transform of each 
of the three M-vectors of which w is composed. This can be done In 
only 0(n log n) arithmetic operations, by use of the FFT algorithm 
(e.g., Brigham, 1974). Next, one awltlpll^ the resulting 3-M'ector 
Fourier conponents wCoj) by n^u), to find lIFw ; this takes only 0(n) 

▲ A 

operations. Finally, one finds Ilw - F IlFw by performing three inverse 
FFTa, taking an additional 0(n log n) operations. To carry out the 
second step, of course, one must have already computed the slow“wave 
eigenvectors rQ((D) and the matrices A(w)» this computation need only be 
done once and for all. 

In our discussion of variational normal mode Initialization In 
Sec. 1.2, we saw that different weights are usually specified over 

regions of different data densities. In this case A Is not block 

circulant, but we still have recourse to Lemma 4. To compute Ilw in this 

case, according to Eq. (5.8), one computes , then IIjA^^^, then 

a-i/2hiA*/ 2« - Hw. The second step is carried out according to Theorem 
2: IIj Is the projection matrix corresponding to the trivially block 

circulant matrix X. Computation of A^/^ and A“^^^ Is usually simple 
also, because one Is usually Interested In local weighting. In which 
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caie A l3 block-dlagonil with 3x3 bloeki» or oven diagonal, aa In the 
case of lunctionala (1«7, 1*8). At worat, therefore, one would have to 
coiqpute aquare roota of ?:<3 natrlcea. In the uaual eaae, In which A ia 
diagonal, only ocalar aquare roots are required. 

5.5. Choice of the Weighting Matrix 

We have seen that the modified KB filter depends, through the 
A-orthogonal projection matrix H, upon the weighting matrix A chosen 
for the error functional. We now describe several choices of A and 
discuss the projections to which they lead, ‘ namely, the parallel 
projection, the orthogonal projection and the minimum-energy 
projection. The latter projection is the one ehoien for the numerical 
experiments described in the following two chapters. 

First we Introduce some assumptions and notation. Eecall that the 
slov7-wave subspace R is defined in Eq. (5.36) In terms of the 
eigenvectors rQ((ij) of f(w), Eq, (4.27), and that the fast-wave 
subspace G is defined in Eq. (4.45) in terms of the remaining 
eigenvectors £tx(w). We assume that the eigenvectors have been chosen 
in such a way that 

£](■“> * (5.42a) 

Ej(^) - Jj(^), (5.42b) 

for j - 0,±1; we already saw that this is possible for j ■ 0 and, for 
the same reasons, such a choice is possible for j •* ±1. For simplicity 
We now assune that also 


f 
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rj(w)rj<w) - 1, w • - 


y •♦'I, ••• , y I J 


l.e., that all the eigenvectora have been norMllzed. 
Equation, (4>27) can be written ai 


t(w) " RCw) D(«) R“^w), 


where R(u) la the 3x3 matrix whose columns are the eigenvectors, 


R(»/^ ■ lr_i(w), ro(w), rjCu)), 


and where D(b)) Is the diagonal matrix of eigenvalues. 


D(oi) - diag I6_i(w), do^***)* ^iCw)]. 


We denote by ij(w) the rows of R^^Cw): 


R"kw) 




Clearly we have 


lj(“ti)) ^ lj(u), u ■ 0, y » 




(5.46b) 
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for j ilnca, for tx«nipl«ii Bq* (5*42a) !■ tqulvalont to R(-u) ■ 

R(u), and tharafora R"^(-w) • R”'(w). 

Tha ij(u>) ara laft aiganvactota of i*a#t 

tj(w) Y(oii) j(w) iJCw) f (5.47) 


vhlch follows upon pranultlplylng Eq • (5.44a) by R"^(w). Notlca also 

that tha equation R(ut)R*'^u) ■ I can be written as 

J-X^u) i*x(w) +rQ((d) 4o(w) +Si(w) ■ I» (5.48a) 

and that 

*t(w) tj(a)) • , (5.48b) 


which follows from R“^(u)R(w) ■ I. 

Before discussing the projections mentioned at the beginning of 
this section, we point out that the correspondence between ’'legitimate” 
weighting matrices A, i.e. , those satisfying Rn Ker A «■ (0), and 
A-orthogonal projections onto R is not one-to-one; rather, it is 
many-to-one. For example, if n is the unique A-orthogonal projection 
onto R for some block clrculant matrix A satisfying R n Ker A ■ {0} , 

then n is also the unique A'-orthogonal projection, where 

A' (oi) - A(w) + Y_i(w)i_i(w)i*i(«) 4 tx(w)li(w)t*(<o) , (5.49) 

for any real scalars is clear from Eqs. (5.46,5.48b) that 

if A(u) satisfies conditions (5.33,5.39c) of Lemmas 5 and 6, then so 
does A'((i>), while fromEq. (5.48b) we also have 
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r|^(u) A*(«) ■ t Q A(w) » 

from which Theorem 2 impllee that the project lone corresponding to X(u) 
and X'(u>) are Identical* 

It Is elmllariy verified that another auch equivalent weighting 
mtrlx la A*’, Miere 

!A”(w) - A(u) +I - (5v50a) 

or simply 

A”(w) “ A(w) +1 - £o^***^ (5.50b) 

in light of the scaling assumption (5.A3)* Although the modified KB 
filter Is chosen to minimise a certain functional n(A) of the analysis 
error* it will also minimize* for example* n(A') and n(A’0« 

The parallel projection . For the parallel projection It is most 
natural tO define the projection matrix first* and then to determine a 
weighting matrix A from which it can be obtained. 

By the parallel projection we mean the projection onto R along the 
fast-wave subspace G . That is, the parallel projection matrix Is that 
matrix II| such than for each x e R^» w - II|X has the same slow 
components as x* and no fast components: the projection Is parallel to 
the 6 -"axis". In other words, the parallel projection Is the one 
which corresponds to the non variational formulation of normal mode 
Initialization. A two-dimensional Interpretation of the parallel 
projection is given In Fig* 2. 
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A precise definition of the parallel projection Is as folloys* 
Since R end 6 together span all of It Is clear from Ec|s. 

(4, 45, 5. 36, 5.42) that the Fourier components of every x e can he 
written In the form 

1 

x(w) - 6 4 (w) xA(a) , (5.51a) 

5-1 ^ J 

for some scalars g 5 (ui) satisfying 

Pj(“tj) ■ p 5 (to), w ■ 0,1,..*, Y ’ (5.51b) 

65 ( 7 ) - Pj(y)» (5.5lc) 

for J “ 0,±1. The parallel projection matrix Is that matrix II, for 

which the Fourier components of w ■ HjjX are given by 

w(w) -Po(uj) rQ(w) ; (5.52) 

the slow components of x are unaltered and the fast components are set 
to zero. 

The parallel projection matrix is defined implicitly by Eqs. 
(5.51,5.52). Clearly there is at most one such matrix, for we have 
defined how it acts on all of R’™. It is also clear that Range H| »R 
and that n I (n I x) “ngx for every x e r", so that such a matrix nust 
Indeed be a projection matrix onto R . 

Now define 3x3 matrices X(w) by 


A(u) - » 


(5.53a) 
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and define the corteepondlng block circulrtnt mtrin 

)} 

A - fU/ (5.53b) 

where 

A - dtag [X(- I +1 ) , X(- j +2) , . . , , A(^)] . (5.53c) 

It is clear from Eqs* (5.A6) that A(b)) satisfies conditions (5.33a,b) 
of Itefdina 5; satisfaction of conditions (5.33cid) is obvious. The block 
circulant matrix A is theiafore real, symmetric and positive 
semldefinlte. Notice that A Is not positive definite; it has rank M “ 
n/3 since, for each u, the rank of A(u) is one. However, it follows 
from Eq. (5.48b) that 

£o(w) A(w) rQ(w) » 1 > 0 , 

whereby Lemma 6 Implies that R O Ker A ■ (0). There exists, therefore, 
a unique A-orthogonal projection matrix onto R ; we show that. it Is in 
fact the parallel projection matrix. 

With A(u)) given by Eq. (5.53a), and using Eq. (5.48b), it 
follows from Theorem 2 that the A-orthogonal projection matrix is that 
block circulant matrix for which 

nA(w) - ro(o)) iS(«). (5.53d) 

Letting w - n^x With x c R", we have 

w - Fft^x - (Fn^F*)(tx) - ii^x , 
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• o that, with x(u) given by Eqs. (5.51), we have 

w(tt) - ny^((i>) x(w) 

• I Pj(u) 
j— 1 

the last equality follows fromEq. (5»A8b). Comparing this result 

with Eq. (5.52), It Is clear that Eq. (5.53d) does give the parallel 
projection matrix, i.e. , - ll| . 

To summarize, the parallel projection matrix H| Is a block 

circulant matrix; It is defined by setting 

Hi (o») - ro((o) • (5.5A) 

The parallel projection matrix Is A-orthogonal for A given by Eqs. 
(5.53a-c). 

The orthogonal projection . The orthogonal projection matrix is 
the one corresponding to the choice A ■ I. In Sec. 5.2, the orthogonal 

projection matrix was denoted by IIj ; we now denote it by Ilj^ , in 

contradistinction with the parallel projection matrix . 

If A “ 1, then we can write 

A ■ clrc [1,0, . , , ,0] , 

whence, ly Eqs, (5.28,5.31), A(u») ® I for allu. According to Theorem 
2, the orthogonal projection matrix Hj^ is therefore block circulant, 
and Is defined by setting 
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nx<w) - ro(w) rJCw), (5.55) 

as the scale factor ■ 1 by assumption (5.43). 

It was shown In Sec. 5.2 that for every x e R”» the vector x - 
IIj^x, which is the vector between x and Its orthogonal projection onto 
R. is orthogonal to every w e R, i.e. « 

w'^(x-nj^x) " 0 for all w e R , x e R*'. 

Equivalently, for all u and for j - 0,±1, we have 

ro(w ) [r j (u) ) " n_L (w ) r j (w ) ] * 0 , 

which follows frotnEqs. (5.43,5.55). The orthogonal projection in two 
dimensions is illustrated in Fig. 2. 

The orthogonal projection is not appropriate for oar discrete 
system (4.7). With A - I in the error functional (5.16), we see that 
the modified filter based on the orthogonal projection would minimize a 
dimensionally inconsistent sum of squares, l.e. , squares of the 
velocity components (m/s) and of the geopotential (m^/s-). 

Comparison of the parallel and orthogonal projections . Having 

defined the parallel and orthogonal projections, we wish to make it 
clear that these two projections are not the same, II| IIj^ , and to 
clarify why this is the case. 

Since the Fourier matrix F is nonsingular, it is clear that Jtj ■ 
if and only if n, ((»>) -nj^(a)) for allu. According to definitions 
(5.54,5.55), the latter equality holds if and only if 
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|0<«) ■ Eo^***) 


(5.$6a) 


Notice that 1£ Eq. (S.S6a) is satisfied, then there Is no 
conflict between the matrix A(o)) ■ which we used to obtain 

the parallel projection and the matrix A(w) ■ I which generates the 
orthogonal projection. From Eq. (5.50b) it follows that the parallel 
projection Is also A”-orthogonal, where 

A"(w) “ + I - » 


and If Eq. (5.56a) Is satisfied, then A”(ti)) ■ 1. 
Now, fromEq. (5.45), we have 

lo(u)) - lR"kco)]* £q , 


where bq is the vector (0,1,0)^; therefore Eq. (5.56a) is equivalent 
to 

£0(01) - iR^^(uj)]* Cq , f-tallw, 


or 


R (u))rQ(o)) ■ 6 q , for all u , 


or 


cJ|((i))rQ(u)) ■ 0 , for all ta , 


(5.56b) 


jAT' 

since we already assuned rQ(w)rQ(u)) “1, Eq. (5.43). By definition of 
the fast-wave and slow-wave subspaces, Eq. (5.56b) Is equivalent to 
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5*^Z “ ® 5 ® ®» Z ® ^ • (5.56c) 

Thu8» the parallel and orthogonal projections are identical if andlionly 
if the fast-wave s-'ubapace is orthogonal to the slow-wave subspace. 
This statement should also he clear from the geometrical Interpretation 
of the {parallel and orthogonal projections indicated in Fig. 2. 

the axes in Figure 2 are drawn obliquely because the eigenvectors 
of our discrete model do hot satisfy Eq. (5.56b): the fast-wsve and 
slow-wave subspaces are not orthogonal* Frimitive-equation models 
linearised about a state with nonzero mean flow also have nonorthogonal 
fast-wave and slow-wave subspaces (Kasahara, 1981). 

This nonorthogonality Is not an artifact of discretization. 
Rath^:^i lb is a property of the differential equations. 

Nonorthogonality of the fast-wave and slow-wave subspaces of the 
shallow-water equations (3.1) Is due to the asymmetric form of the 
equations and to the appearance of the term (-fUv) In Eq. (3.1c). 
This term arises because the solution about which Eqs. (3.2) were 
linearized has 4^ 0, i.e. , a free surface with nonzero slope in the 

meridional direction. 

With the term (-fUv) removed from Eq. (3.1c), the change of 
variables 

u /♦ u , V V- V , ♦ ♦ (5.57) 

In Eqs. (3,1) results in the symmetrized shallow-water equations, 
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+ UUj5 + /♦ f V • 

0 . 

(5.5Ba) 

v^ + Uv^ + £u ■ 

0 , 

(5.58b) 

♦ t + “x 

0 ; 

(5.58c) 


cft Krelss and Oltger (1973, Ch. 7). For this ayatem, cprreapondlng 
to Eq . (3.9b) we have 



” 5U 

if 

5 /* " 

G,(0 - - 

-if 


0 


Cv'* 

0 



Im 



Since Gp(5) is Hermltlan, l.e.. 

- 


> 


its eigenvectors are orthogonal, and therefore the slow-wave and 
fast-wave subspaces of the continuous system (5.58) are orthogonal. 
Since Gg(^) 1 b Hermltlan, it Is also normal , l.e. , 

G*(0Gg(0 - Gg(0Gj(0 . 

A matrix has a complete set of orthogonal eigenvectors If and only if 
It Is normal. The matrix G(4' ) given by Eq. (3.9b) Is not normal, and 
the slow-wave and fast-wave subspaces of the original system (3.1) are 
not orthogonal. 

As a consequence, the slow-wave and fast-wave subspaces of the 
discrete model (4.7) are not ortfogonals the symbol $(u) given by Eq. 


-m- 


(4.24) is not 9 norml matrix and Its elgsnvectors do not satisfy Bq. 
(5.56b). The Rlchtn^er tW"#tep scheme for the modified System (5.58) 
does have a nofnsl symbol.; end therefore has orthogonal slow-wave and 
fast-wave subspaces. 

We hope to have iXlustrated by means of this example that 
continuous models having orthogonal fast-wave and slow-wave subspaces 
are special: the equations nust be written In symmetric form and must 
have been linearized about a specially chosen state. Orthogonality may 
or may not carry over to a corresponding discrete version of the model; 
one nust still check to see If the discrete model's symbol is normal. 

The minimum-energy projection . For the numerical experiments; we 
choose the modified filter to minimize a physical quantity; namely; the 
expected energy of the analysis error. The energy of solutions of Eqs. 
(3.1) Is proportional to 

li/2 

j (u^ + v^ + VC') dx . 

-L/2 

We choose A to be the diagonal matrix with the elements (1;1;1/*) 
repeated along Its diagonal; the corresponding error functional ti 
represents the e:>q)ected energy of the analysis error, and we refer to 
the corresponding projection as the minlnum-energy projection. The 
minimum-energy projection, denoted by Hg , is distinct from the 
parallel and orthogonal projections, and Is depicted in Fig. 2. 

The minimum-energy projection was computed by the method of 
Theorem 2. The weighting matrix A Is block clrculant, 

A ■ ClrC [Aq;0;...,0] 




(5.59a] 
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where 

Aq- dlag (1,1,1/* ) . 

whence » from Eq. (5.31) » 

A(o») - dlag (l,l,l/*) . 

According to Theorem 2, therefore^ the 'minimum-energy 
la the block clrculant matrix II^ for which 

ng((D) - o^, ro(w) EoCw) dlag(l,l,l/*) , 

where 


(5.59b) 

(5.59c) 

projection matrix 

(5.60a) 

(5.60b) 
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CHAPTER SIX 

EXPERIMENTS WITH THE STANDARD AND MODIFIED KAmAN-BUCY FILTERS 


Numerical experlmenta with the standard and Modified KB filters 
will now be described. The results show that the modified filter 
produces slowly evolving state estimates, at the expecue of estimation 
errors only slightly larger than those resulting from use of the 
standard KB filter. Results show also how each filter utilises the 
information advected between data-dense and data-sparse regions. The 
importance of proper use of advected Information will be further 
demonstrated in Chapter 7. 

6.1. ObservinR Pattern, Noise Covariances and Initial Data 

We complete the description of our implementation of the standard 
and modified KB filters, Bqs. (2.22, 5.24), by choosing an 

observational pattern , noise covariances Rj^ and , and initial 
data wg and ; the dynamics matrix T and projection matrix II ■ Hg 
have already been described. 

To recapitulate, the projection matrix is given by Eqs. (5.60): 
by the symbol II we now always mean the mlnlnum-energy projection Hg . 
The dynamics matrix y is given by Bq. (4.7b); the parameters f , U, ♦ 
and L are given following Eq. (3.3), and the mesh parameters are M ■ 
16 grid points and At - 30 min., as mentioned in Section 4.4, 

Discretization With 16 grid points leaves a computational problem of 
easily manageable size. The corresponding choice of At * 30 min. is 
near the stability limit of the difference scheme, and results in r ■ 
24 time steps per synoptic period. An experiment using 32 grid points, 
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for a spatial raaolutlon clooer to that of operational NWP nodell» gaVa 
results quite simile^ to the comparable experiment with 16 points* 

Wb study an obaervlng pattern corresponding to the conventional 
meteorological upper~ai.r networks all quantities (u^Vi^) are observed 
over "land"* at synoptic times* and there are no observations over the 
"ocean"* The distribution of land and ocean at latitude 6 ■ 6 q * where 
the earth's circumference Is 2L* Is simplified to be 2-perlodlc* so 
that half of each Interval of length L Is covered by ocean (Pacific or 
Atlantic)* and half by land (North America or Eurasia). For this 
reason we consider only 2-perlodlc solutions of 'Eqs* (3*1) » and our 
computational domain Is of length L ; cf. Eq. (3.3). 

We consider the left half of the computational domain to be 
covered by land* and the right half to be covered by ocean. For 
simplicity we take the observing stations to be located precisely at 
grid points. The observation matrix is therefore 

in (I 0) (6.1a) 

when k Is a nultlple of r * 24 time steps* and 

\ * 0 (6, lb) 

otherwise; observations are available at synoptic times only, i.e. * 
every twelve hours. 

For a single wave number w* initial data for the continuous system 
(3.1) which lead only to slow waves are given approximately* according 
to Eqs. (3.29)* by 
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♦ (x,0) ■ 4 q iin 5x » 


u(x,0) - -- — _ iin Cx , 
C2#+f2 


vvXjOJ ■ cos ^X f 


( 6 . 2 «) 

(6.2b) 

(6.2c) 


where C ■ We choose initial data corresponding to a single 

Rossby wave with wave number w - 2, i.e., 5 ■ %/L, and aiq>lltude - 
2.5 X ICP m^/s^. The latter is in accordance with a typical 
ridge -to- trough difference of 500 m in V.he height of the 500 millibar 
pressure surface (Palnen and Newton, 1969, Sec. 6.6). It follows that 
• 1/12, which partially Justifies th-^ linearization of Eqs. 
(3.2). 

It follows also tliat the amplitude of v(x,0), 

Vq - S 1.122 U , (6.3) 


is roughly equal to U, a realistic value. Note, however, that the 
amplitude of u(x,0). 


“0 


c2*+f2 


0,059 U , 


(6.4) 


is relatively small. This is in agreement with the results of Section 
3,6 j due to the absence in Eq. (3, lb) of a pressure-gradient term to 
balance the Coriolis term, the continuous and discrete slow-wave 
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■ubtptc«t have rather eiiall \i-conponenta . the u-confiQiient ie apeclal, 
and lt§ behavior during an aaalmllatlon will provide one convenient 
check of the departure of eitlnatea from the elow->rave aubapace* 

Initial data for cxperliaents with both the atandard and modified 
filters were obtained by evaluating w(x (M/2 - l)Ax, 0), given by 
Eqs. (6.2), at the grid points “ jAx* J *' Hd/2+l, . ..,M/2. Denoting 
the result by w^, we then set 

w|5 - H wj , (6.5) 

In accordance with Eq. (5.14a). the difference between Wq and w§ is 
Indicative of the difference between the discrete and continuous 
slow-wave subspaces, and of the degree of approximation In Eq. (3.28). 
Wa found a small but significant difference between ^ and So • 
particular, the anplltudes of the u-, v-, and f -components of wg are 
0.993 uq , 1.016 VO and 0.960 • 

As we have already pointed out following Eqs. (2.22, 5.14), the 
gain matrices of the standard and modified KB filters are independent 
of the state estimates. In particular, the gain matrices are 
Independent of w^. Thus the choice of w^ is made primarily for 
orientation purposes, and similar results will obtain for any Initial 
estimate satisfying Eq. (5.14a). 

The observations, made twice per day over the eight grid points 
located on “land”, _< 0, are assumed to have errors uncorrelated in 
Space, as well as in time. That Is, we take the observation error 
covariance matrix to be diagonal. The observation error variances, 
or diagonal elements of , are taken to be constant in time: Rj^ ■ R •* 
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contt, wh«ti k Itt a okilfclpla of 24 tloMi atapif Tha varlaneaa art alio 
conttant In space » and their values are based on data from McPherson at 
aa. <1979, Table 2). 

The standard deviation of conventional tenperature observatlohs 
used there Is l^C, This can be converted, based on the cuetonary 
hydrostatic assuo|>tlon, to a 500 millibar level geopotential error of 
approximately 200 m^/s^. This value corresponds to an error of about 
0. 1 A corresponding 10 ? error In the wind components, relative to 
Vq , Is roughly 2 m/s ; this Is slightly larger than the value of 1.5 
m/s used by McPherson et al . (1979). We take the standard deviation 
In observations of 4 to be 200 m^/s^, and that In observations of u and 
V to be 2 m/s. Relative errors In all observations are thus about 10%. 
The Initial error covariance raatrln Pg la taken to have the form 

p§ - n dJ + (I - n) d| (i - ID'^ , (6.6) 

which results fror the assumption that the slowwave and fast-wave 
components of the Initial error ate uncorrelaCed: 

wg - wg - + (I-n)X2 » (6.7a) 

where 

®ZiXj " *^i ^ij • <6. 7b) 

This assumption Is made for convenience, and because of lack of 
Information on the cross-correlations of the two types of errors; It 
can, of course, be easily removed. 


- 118 ^ 


Denoting by D the diagonal matrix with the elements (vq»V 0 , 4 o) 
repeated on the diagonal, we take 

Dj»0,4D, D2"0,1D « (h*8) 

Thus the initial error covariances are uniform over the entire domain 
and the initial error variances are nuch larger than the observational 
error variances. Most of the initial error lies in the discrete 
slow-wave suhspace; the true initial state has only a small fast 
component, * This uniform distribution of initial error will 

make it easy to visualize the reduction of error resulting from the 
first synoptic, observation. 

The form chosen for the system noise covariance matrix is 
similar to that of We take to he constant, ■ Q» with 

Q - n D§ + (i-n)Dj(i-n)’^, (6.9a) 

where 

D 3 - yD , D 4 - 0.25 y D. (6.9b) 

The parameter y Is chosen on the basis of atmospheric predictability 
studies, cf. Sec. 1.2 and the discussion following Eq. (2.2a), as 
fellows. 

Suppose that {wg} and {w|^} ars two realizations of our 
stochastic-dynamic model, Eqs. (2.2, 4.7b, 6.9), starting from 

identical initial states, wg ■ Wq. The covariance matrix 

\ - E(wg - wjKwg - w{)T 


( 6 . 10 ) 


vr-- 
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•volvas according to 

Xfe - t X|j_i + 2 Q » (6.1U) 

Xq - 0 ; (6.Ub) 

c£> ‘i£q. (2.10a). Hie expected energy of the difference between the 

two realiuitions is 

“ Sk>'^ A<!ik " Hk) * 

where K is the energy-weighting matrix given by Eqs. (S.59a,b). We 
have 


■ trace AXj^ (6.13a) 

Eq ■ 0 ; <6. 13b) 


cf. Eqs. (2.12,2.13). 

The correlation matrix of the two realisations is given by 

Cr - ECsEXsJ)'' . <‘-' 4a) 

and evolves according to 

Cj^ - f Cy^_j . (6.14b) 

The correlation matrix starts from a nonzero value and tends to zero as 
k as a result of the dissipation of the dynamics matrix Y . 

For our stochastic-dynamic model to have the same predictability 
as the atmosphere, we would like the two realizations, perfectly 
correlated at the Initial time, to become nearly uncorrelated. 


after a finite predictability time p of about two to three waeka. If 
Cp - 0 exactly, then we would have 

Xp ^ E(wp(wJ)T + E(wt)(„t)T 

and 

■ E(w®)T 4. E(wJ)'^^ A,(wp. 

That Is, would grow from zero at time zero to an amount at time p 

equal to twice the expected energy e| of either w® or w^ , 

eJ - 2 e2 . (6.15) 

Of course Eq. (6.15) would be true of functionals other than the 
energy. For simplicity we base the choice of Q on only one free 
parameter, Q “ Q(y), so that the growth of only one functional can be 
prescribed. 

We regard the energy eJ as a "typical" ettergy, as system (2.2a) Is 
not conservative. We take the typical energy to be that of w^ , 

E* ■ trace A(wg)(wg)'^ , (6.16a) 

with w§ given by Eq. (6.5). Our assimilation experiments are run for 
10 days, l.e. , for N - 480 time steps. We determine y, and hence Q(y), 
by specifying a parameter o close to one and requiring 
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Ug - 2ahi • ’ (6.16b) 

Thust, convenience we specify tlie amount of loss of predictability 
over the length of an assimilation run. 

We set a ■ 0.7, and found y “ 0.028, This was easily done by 

trial and error: according to Eqs. (2.10a, 6.11, 6.13), 

■ 2 trace APfj , (6.16c) 

where Pj^ is the estimation error covariance at the end of a run with 
zero initial error and no observations. Thus, we computed E^ In Eq. 

(6.16c) for various choices of y, until Eq. (6.16b) was satisfied. 

1 

The choice of a ■ 0,7 corresponds to a 70% rms loss of 

predictability at time k ■ N. This value of o, rather than a value 
closer to one, was chosen because N < p: continues to grow after 

time N and Cj^ continues to decay. The levellng-off time of Ej^ , and 
the decay time ot Cj^ , is much longer tlian N and Is a function only of 
the amount of dissipation In the dynamics matrix P ; cf. Eq. (6.14b). 

To complete the description of our assimilation experiments, we 
note that the observations wg in Eqs. (2.22e, 5,24e) were obtained 

from an actual realization of the stochastic-dynamic system (2.2, 2,3), 
That Is, We generated independent random vectors having the 

prescribed covariances, Pq for wq-Wq, Q for and R for and 
accordingly obtained random vectors wg from Eq, (2.3a). 
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6*2. Numertcal Results 

In Figure 3 we show at selected grid points the tins histories of 
the estimates k ■ 0,1,2, produced by the KB flltef; 

u-conponents of the estimates are shown In Fig# 3a, v-conponents In 

Fig, 3b, and ^“components In Flg» 3c, We show a point on the West 

coast of the continent, labeled SF (for San Francisco), one on the East 
coast, labeled NY (for New York), and one In the middle of the ocean, 
labeled HA (for Hawaii). Note that "Tokyo'* - "New York" by 

periodicity. The ordinates In Figs. 3a, b are scaled by Vq , Eji. 

(6,3)/ and the ordinate in Fig. 3c Is scaled by , 

On each curve, email-amplitude fast oscillations are superimposed 
on a smooth, slowly varying wave pattern. The fast oscillations are 
caused by the Introduction of noisy observations of the true statei the 
true state has a fast component due to that of the system noise and 
that of the initial otate wj. The fast oscillations are especially 
apparent in the u-comjjonents , Fig. 3a; recall from Sec. 3.6 and from 
the discussion following Eq. (6. A) that u-components are very 
sensitive to departures from the slow-wave subspace. Notice in Figs. 
3b, c the underlying periodicity of about 6 days. This Is In agreemai^t 
with the phase speed Cq( 2) shown in Table 3* 

cq(2) - 13.12 m/s = 12.'35 Hayi ^ 

one-half of the 2-wave we are estimating passes through the L-domaln In 
just over 6 days. 

For the assimilation run with the modified filter, the results 
corresponding to Fig, 3 are shown in Fig, A, The time histories of 
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the estlmtes In this case are perfectly smooth, apart from the jumps 
due to observational Insertions: the estimates evolve entirely in the 
slow-wave subspace* The 6iday periodicity is now evident even in the 
u-components, Fig* Aa* The u~coR^onents change very little at 
observation times: the corrections added to the forecast at observation 
times lie in the slow-wave subspace, and therefore have small 

u-components. By contrast, the corrections to v and ^ observation 

times, Hgs. Ab,c, are often quite large at SF and NY, The 
corrections at HA are nuch stoaller because no observations are made 
there* Corrections at HA are due only to, and are nade to a degree 
consistent with, the correlation of the forecast error at HA with the 
forecast error at observation stations inland. 

To study the behavior of the estimation error for the KB filter 
experiment, we show in Fig. 5 conponents of the expected rms 

estimation error resulting from use of the KB filter. Figure 5a shows 
the expected rms error over land. Fig. 5b over the ocean, and Fig. 5c 

over the entire domain* The Individual curves are labeled U, V, P and 

E, for the expected error in u, v, srid the total energy, averaged 
over the indicated region. Thus, the U-curve in Fig. 5a gives the 

square root of the average of the first eight u-coi^onents along the 
diagonal of The ordinate in each panel is scaled by Vq for the 

U- and V-cur\es, by (|.q for the P-curve, and by 2v§ + for the 

E-curve. Ihus, the observational error level in each panel is 0*089 
for the U- and V-curves, 0*080 for the P-cutva, and 0.088 for the 
E-curve. 

The errors over land. Fig* 5a, drop below the observational error 
level immediately, at the first synoptic time* In fact, the error 
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reduction over land at each eync^tlc tine is draoatlci and reeult>s In 
errors below the level of observational noise* The error reduction 
over ocean at each synoptic time, Fig. 5b, is less pronounced but is 
still significant: the KB filter is able to spread out the new 
information from observations over land to adjacent ocean areas. 

Also striking is the difference between Figs. 5a and 5b in the 
error growth between synoptic times. The large increase of error over 

land in between synoptic times, shown in Fig* 5a, is due to the 

combined effect of system noise and advection of error from over the 
data”sparse ocean. The much milder increase of error over the ocean in 

between synoptic times, shown in Fig. 5b, is due to partial 

cancellation of these two effects: the effect of system noise Is still 
the same, but relatively error-free Infoirmatlon is being advected from 
over the data-dense land. 

Notice also that the curves shown in each of Figs. 5a,b,c quickly 
settle into an asymptotically periodic pattern, with the synoptic 
interval of 12 hours as the period. This behavior is typical of 
time- independent models (H',Q) with periodic observations The 

convergence occurs within about one day over land, and in about 5 days 
over the ocean. In fact, the KB gain matrices used at observation 
times tend rapidly to a constant gain matrix, K^® 1^®. 

Expected rms errors for the experiment with the modified filter 
are shown in Fig. 6. The errors in v and <|i, over both land and ocean, 
are nearly Indistinguishable from those of the standard KB filter: 
slowly evolving estimates are obtained at the expense of only a very 
slight Increase in estimation error. Errors in the u-component, 
however, are significantly larger than in the case of the standard 
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filter* The u-conponent errors resulting iron use of the nodified 
filter gr<M in tine and, after 10 days, are about twice the size of 
those resulting from the KB filter. The larger u-conponent errors are 
due to the fact that the modified filter allows almost no observational 
correction to be performed on the u-conponents : the u-cosponents must 
remain small for the analysis vector to lie In the slow-wave subspace, 
Ihe u-conponents of the true state, on the other hand, are growing 
because of the contlnnal input of the fast component of the system 
noise. 

To visualize better the behavior of In time and to study the 
structure of KjP, we plotted the Influence functions of selected 
observation stations. The Influence functions of an observation 
station show the weight given to an observation of u, v, or ^ at that 
station when updating points throughout the' domain. The influence 
functions at time k are obtained from appropriate columns of kJ^®. 

The chosen observation stations were SF, SL (for Saint Louis) and 
NY. There i> e no influence functions for mid-ocean points, like HA, 
since no observations are made there. Influence functions were plotted 
at every synoptic time, l.e., every 24 time steps. It was clear that 
convergence to k]® occurred within about 5 days. 

Figure 7 shows the Influence functions for the selected 
observation stations at the end of day 10. Figure 7a, marked (u-u), 
gives the influence of a ^ observation at the selected stations on ^ 
corrections at every grid point in the domain. Figure 7b, marked 
(u-v), gives the weight of a observation at a station on the 
corrections at every grid point, and so on. The variables have been 
scaled in the usual way, u and v^ by Vq , and ^ by . 
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All the >ielghtlng coefficients involving n ere rather small (Figs. 
7a»b|C»d»g). Our choice of system noise covariance matrix Q» with the 
4-to-l ratio of Dj to , entails relatively good predictions of u, 
Uilch have to he corrected only to a small extent hy the observattoju. 
Ihe (u’ni) coefficients (Fig. 7s) are the largest of the coefficients 
Involving u,; th^ still do not exceed 0.125. The (u-m) influence 
functions are approximately equal for SF, $L and NY; they are positive 
and symmetric about the observation station. Th^ are the only ones to 
have both of the latter properties. 

The (^- 1 ^) Influence function centered at SL Is the smallest one 
shourn in Fig, 71. It Is positive over land, becoming nearly zero at 
SF and NY, and slightly negative out Into the ocean. The symmetry and 
relatively small peak of this function Is due to Its station, SL, being 
located In the middle of a data-dense region: neighboring points also 
have obseirVatlon stations and advectlon plays but a small role. 

The peaks of the (i}i“<^) Influence functions centered at NY and at 
SF are considerably higher than the SL peak. This Is due to the 

absence of observations on the ocean side of these stations. In fact, 
the peak of the SF function Is slightly higher than the NY peiik. 
Moreover, the former Is located one grid point west of SF, rather than 
at SF itself, while the NY peak Is at NY. Both data density and 

advectlon thus play a role. 

It makes Sense for the point upstream of SF to give even more 
weight to SF Information than SF Itself: SF Is closer to Inland points 
and their Information Is also weighted heavily. Due to the advectlon 

of error, the forecast error at synoptic times for this ocean point Is 

considerably larger than that for the point downstream from NY, 


although they are equldlatant from land. Hcnoe» more walght; le given 
to adjacent land obaervationa for the Pacific point than for the 
Atlantic point. 

As in Fig. 7i, the (v-v), (v-^) and (♦“v) influence functions 
(Figs. 7e,f,h) all shov strong inhomogeneity differences between the 
SF, SL and NY functions, as well as anisotropy differences to the west 
and east of each station. The SL influence function io nearly 

symmetric for (v-v), and it is nearly antisymmetric for (v-^) and 

(^-v); the corresponding SF and NY influence functions do not have 

these symmetry properties. We will see in the following chapter that 

this differential treatment of observations located in the middle of a 
data-dense region (SL) and observations on the border between 
data-dense and data-sparse regions (SF and NY) is important for the 
proper performance of data assimilation schemes. 

The SL influence functions at the first synoptic time (Fig. 8) 
are either perfectly symmetric (u-u, u-v, v-u, v-v, and 
perfectly antisymmetric (u-r}», v-i}), ^-u, and ^-v). Similarly, in each 
panel of Fig. 8, the NY Influence function is either the mirror image 
or the Inverted mirror image of the SF function. 

Comparison of Fig. 8 with Fig. 7 allows us to distinguish 
between the effect of inhomogeneous data density and the effect of 
advection. Figure 8 shows the effect of data distribution only, since 
at the first synoptic time no information has been advected yet from 
previous data insertions. Figure 7 shows the combination of the two 
effects. 

Different data densities result in different influence functions 
according to station location (Fig. 8): stations located in sharp 


gradients of observation a vallabiXity» such as SF and NY» have vote 
Influence than Inland stations (SL), and their Influence out to sea is 
also greater than their Influence inland* It is advection» howevern 
which leads to the difference between the influence functions of 
stations on the West coast (SF) and East coast (NY). The latter 
difference was discussed in connection with Fig* 7i» and Is also clear 
in Figs* 7e|f,h. 

The corresponding results for the modified KB filter (not shown) 
are very similar to those for the standard KB filter shown in Figs. 7 
and 8* The main difference is that the (u-u)* (u-v) and (u-f) influence 
functions are almost perfectly flat* The modified filter allows even 
less correction to the u**conponents than does the standard filter: the 

estimates are forced to remain In the slow-wave subspace* 

The KB gain matrix at day 10, a good apptoxlma tlon to the 
asymptotic gain matrix was used as a constant, tlme-*independent 
gain matrix In another assimilation experiment* Estimation errors 
after 1-2 days were practically indistingulslmble from those obtained 
when using the KB filter. Similarly, a run using the final gain matrix 
of the modified filter gave results almost identical to those of the 
modified filter Itself* There is therefore no need, in our 
time-independent model (f ,Q,R), to compute a new gain matrix at every 
synoptic time: approximate computation of the asymptotic gain matrix 
once and for all is sufficient for practical purposes* The asymptotic 
KB filter, known as the Wiener filter , is analyzed more fully in Ghll 
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CHAPTER SEVEN 

COMPARISON WITH OPTIMAL INTERPOUTION 


7. 1 . The Optimal Interpolation Filter 

We have already pointed out that the difference hetueen optiaal 
interpolation (01) and the KB filter la that Ot la haaed on an aaauned» 
preacrihed forecaat error covariance niatrix» rather than on the 
covariance matrix P|[ which reaulta from aaaumption of a 

atochaa tic-dynamic model (2.2)* He denote the preacrihed covariance 
matrix 1y and base our formtlation of s|^ upon the 01 achemea uaed 
at NMC (Bergman. 1979; McPherson et al », 1979) and at ECMWF (Lorenc» 
1981). 

We implement 01 for our usual forecast model (A. 7). 

2k “ '*'2k-i ; 

as 01 is an unbiased linear data assimilation scheme, the 01 update 
equation can be written as 

2k “ 2k S<2k “ 

cf. Eqs. (2*7). Optimal Interpolation schemes are derived by 

minimizing the analysis error variance at every grid point, assuming an 
observational error covariance matrix and a forecaat error 
covariance matrix s]^. The 01 gain matrix. is therefore 

Identical to the KB gain matrix, «d.th P|^ replaced by s|^: 

■ s£"kt%Sk«k + «k>'‘ i 


(7.2) 
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ef» Eq. (2.20a) » 

With as the forecast error covarisnce mtrlx and with “ k|^^» 
it follows that the 01 analyais error covariance natrix S|^ ie given by 

Sj - (I - ; <7. 3a) 

cf* E<,. (2.21). We also define the diagonal laatrix 

Dj - diag Sj (7.3b) 

of analysis error variances. Equations (7.1b»7.2^7.3b) are Identical^ 
respectively, to Eqs. (2.5,2.12,2.13) in Bergnan (1979), although we 
use a more compact notation. 

It remains to describe how the forecast error covariance matrix S|[ 
is formulated in OX. Every covariance matrix S can be decomposed In 
the form S ■ b1/2cbV 2 yhere B •* diag S is a diagonal matrix of 
variances and where C « B“^/2sB”’^/2 jg g correlation matrix. In 01 it 
is assumed that the forecast error correlations are time-independent, 

Sjf - (D|^)l/2c(Df)l/2 ; (7.4a) 

the correlation matrix C is constant and is a prescribed matrix in 01. 

The forecast error variances, 

dJ - diag sj , (7.4b) 


are assumed to grow linearly in tine, i.e. , 


(7.5) 


whtn r Is the length of the esslmiletlon cycle* r - 24 tlae steps In 
our c'ts^* The time-independent diagonal Mtrlx D Is a prescribed 
forecast error growth rate matrix; cf» McPherson et al » (1979, Sec. 
2c.4). Equations (7*4, 7. S) deacrlbe the evolution of S|[ ; they can be 
regarded ar an approximate version of Eq. (2.22b). 

Except for choice of the natrlces C and D, Eqa. (7. 1-7.5) define 
the Implementation of 01 for our shallow-water model (f 
we take S^ « P^. We refer to Eqs, (7.1-7.5) as the 01 filter . For 
experiments with the 01 filter we also compute the true forecast and 
analysis error covariance matrices, 

pj - + Q , (7.6a) 


With \ - k2^; cf. Eqs. (2.10). 

We define also an Initialized 01 filter, having gain matrix 

KfOT - n Kg^ . (7.7) 

where H Is the usual mlnlimim-energy projection. In this case, Eq. 
(7.3) Is replaced by the general formula 

Sj - , (7.8) 

With -> For experiments with the initialized 01 filter we also 

compute the true error covariances (7.6), with Kj^ ■ 


The standard assumption by which the correlation matrix C Is 
determined Is that mass field (geopotential or temperature) 
correlations are homogeneous, Isotropic and Gaussian, with the 
remaining correlations derived by assuming that forecast errors are 
geostrophlcally related (Bergman, 1979, Sec* 3; Lorenc, 1981, Sec* 
4b)* For our model (3*1), the geostrophlc relation Is 

u - 0 , V - 4jj/f , (7*9a,b) 

and therefore the assumed forecast error correlations, which define the 
elements of C, are given by 


- exp [-(xi-xj>^/s§l , 

(7* 10a) 

<Ts - 1‘ - 2(»l-Xj)V»§l Cfj . 

(7* 10b) 

cij - n l(xi-xj)/,ol , 

(7* 10c) 

- - ct5 , 

(7*10d) 


for <hl2 on our periodic domain; we take for the correlation 

distance Sq ■ 1000 km, In agreement vdth the value used at NMC* 

The five correlation functions not specified In Eqs* (7*10), 
l*e* , those Involving ju, are all zero due to the geostrophlc assumption 
(7*9a)* In our version of 01, therefore, forecasts of ^ are not 
changed at analysis tiroes, nor are the v and ^ analyses affected by 
observations of tr. This Is not unreasonable for our model; we have 
already seen that the same Is approximately true of the standard and 
modified KB filters, as a result of the A-tO“l ratio of Dg to In Eq* 
(6* 9b) and since the slow-n^ave subspace Is quasi geos trophic* 
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In our first set of experiments, the diagonal forecast error 
growth rate matrix D was chosen in the following way. We have already 
seen that the KB gain matrix tends to a constant matrix, and that the 
corresponding error covariance matrices tend to an r-periodlc 

sequence, ■ Pkl®* At the end of the KB filter run, k “ 460 time 

steps, we averaged the diagonal elements of P,^ - Pg.^ over the entire 

spatial domain. Thus, we found for the KB filter the average 12-hour 
growth rate for the variance of each of the three variables u, v, ^ • 
These averaged rates were then used as the diagonal elements of D for a 
prellmlnaiy run with the Ol filter. The assumifed growth rates for this 
first 01 run, and for the runs we describe next, are therefore 
Independent of the longitude x, as is the case with the 01 scheme 
described by McPherson et al . (1979, Sec. 2c.4). 

Using Eqs. (7.6), we computed next the true growth rates 

{ dlag(P|f-p^_j.) ; k is a nultiple of r} produced by the preliminary 01 
run, and we averaged them over the spatial domain. These averaged 
rates, although relatively constant after about 5 days, were somewhat 
different from the originally assumed growth rates. In order to 
produce a control run for 01, we made the following succession of 
lO-day 01 runs: the space-averaged true growth rates for each run, 
starting with the first run described already, were averaged in time 
over the last 2 1/2 days and used as the assumed growth rates for the 
next run. This procedure converged rapidly to our control 01 run, 
which we call run Aj, 

Thus the true growth rates for run Aj , averaged over days 7 1/2 — 
10, agree with the prescribed growth rates D. This corresponds to the 
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aiBuned rates being specified by an "optimal verification" scheme. We 
denote these growth rates by d^ ^ and d|. 

Having determined the control run we performed next a series 
of 01 runs , to study how the true error covariances Pj|»® depend on 
the assumed growth rates. Run A^ uses the elements <(od„)^, (o‘*v)^> 
(ad^)2) repeated along the diagonal of D. We used values of a ranging 
from 0 to 2 in Increments of 1/4. 

In an analogous fashion, we determ:i\ned an initialized 01 filter 
control run (K ■ using the Initialized KB run (K ■ as 
the starting point. We then performed the corresponding series of runs 


7.2. Numerical Results 

In Table 4 we summarize the results of the 01 runs A^ and B^. For 
comparison, we also include the corresponding results of the run with 
the KB filter, which we refer to as run Aj^ , and of the run with the 
modified KB filter, which we refer to as run Bj^ j these are the two 
runs described in Chapter 6. 

The table entries are the true expected r ms analysis errors at 
selected grid points after 10 days, l.e., they are the square roots of 
selected diagonal elements of the analysis error covariance matrix P^gQ 
produced by each run. The selected grid points are at Saint Louis (SL, 
X ■ -34x), Hawaii (HA, x ■ Six) and London (LN, x ■ Six); LN Is an 
ocean location adjacent to the continent. Ihe entries are scaled In 
the usual way, u and v by Vq , and by «t>o * recall that with this 
scaling the observational error levels ere 0.089 for u and v, and 0.080 
for The Uji^ and Uj^jj entries are omitted because the u-errors are 
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nearly constant over the spatial domain; the 01 runs leave u_ undianged 
at analysis times » and the KB runs change ^ only slightly* 

The mininum value in each column occurs for run , as expected: 
the KB filter Is (^timal. The errors for run are only slightly 

larger than those for run Aj^ , as we saw already in Chapter 6. Notice 
that for runs A,^ and B^b , < Vy, < vh^ and 4st < ♦lN < tKA • * 

result of our conventional observing pattern. 

In contrast to the relative performance of runs Aj^ and B^b , the 
runs perform better than the runs. This is evident especially at 
SL and LN; there is little difference at HA since there are no nearby 
observations . 

The effect of initialization is most dramatic for Ug^ . The 
A-tO”l ratio of Bg to in our definition of Q, Eqs. (6.9), forces 
the true state w^ to always lie near the slow-wave subspace, and the 
analysis vector in all B-runs always lies in the slow-wave subspace. 
Since the slow-wave subspace has small u-components , the rms errors in 
ju must therefore always be small in the B-runs, 

In contrast, for runs is large and increases with a. As 

a Increases, so do the assuned forecast error variances , Therefore 
observational data are given more weight and the analyses drift further 
and further from the slow-wave subspace. 

''*HA ‘♦’HA » there is little difference between the A^ and B^ 

runs, and these errors are relatively insensitive to the size of a. 
There are no nearby observations to correct forecasts at HA, so the 
forecast error variances near HA are immaterial. However, the '^HA and 
^HA ®ftors are significantly larger for the A^, and B^j runs than for the 
Aj^ and B^b runs. Tills is due to advectlon of error from grid points 
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whldi have been updated by the obaervatlonal data* to which allghtly 
Incorrect weights have been assigned In the 01 runs* 

^SL » ’'otlce that rung A^, «nd » « > y » results quite 

comparable to runs Aj^j and . On the other hand, the vsl errors are 
ouch larger for the runs than for the Ajgj run, while the vgL etrots 
are quite comparable for the B^ and B|^ runs* figure 9 helps explain 
why this Is so. 

Figures 9a, b show the assumed andl r*v forecast error 

correlation functions, Bqs. (7.10a,b), used in the Ot runs. Figures 
9c, d show the true forecast error correlation functions, deduced from 
at 10 days, for run Ai* Figures 9e,f show the true forecast error 
correlation functions at 10 days for run Bj , The short-dashed lines 
indicate the correlation functions at SL, the loivg-^ashed lines 
correspond to HA, and the solid lines correspond to LN* The true 
correlations are not homogeneous: the curves in Figs* 9c-e do not 
superimpose* 

The correlations at SL are quite similar in Figs. 9c and 9e, 

and not altogether different from the ^-i)» correlation In Fig* 9a. 
Thus the A^ and runs are able to produce reasonably good analyses of 
at SI, provided sufficient weight, « >, y , is given to the wealth of 
obeervatlons available nearby. The v-v correlation at SI in Fig. 9d 
ia far from that in Fig. 9b, 'i*»ich was based on the geostrCphic 
assimption. Consequently, 01 does not make adequate vise of data 
available nearby and, as seen in Table 4, the Vgj^ errors for the A^ 
runs are large. The v-v correlation at SL in Fig. 9f is nuch closer 
to that in Fig. 9b; the correaponding Vg^^ errors for the B^ runs are 
nuch Itqj roved. 


/I 

-137- 

The situation is similar at LN. the portion of the solid curve 
Just to the right of center In Figs. 9d,f show that at Uv)ls In 
truth positively correlated with v at nearby land locations , with a 
large correlation coefficient. The curve In Fig. 9b ahows that the 01 
runs use Instead a negative correlation coefficient. As a result^ the 
Vj^jj errors (Table 4), for large values of a» are actually larger than 
the Vjj^ errors. In fact, the v_ analyses at LN turn out to be worse 
than the v forecasts at LN, In all and runs. 

As the curve In Fig. 9b poorly approximates the LN curves in both 
Figs. 9d and 9f , we must ask why the Vj^jj results in Table 4 are better 
for the runs than for the runs. The answer lies in what happens 
between analysis times. 

The analysis errors in the Bj^^ runs propagate at Rossby wave phase 
speeds only. The maximum Rossby speed for our model T , according to 
Table 3, Is 

Cq(3) - 13.73 m/s - 0.68 Ax/12 hr. 

Errors in the B^^ runs are therefore localized ; new observational 
Information Is inserted before error from the previous Insertion can 
travel one grid point. 

Errors in the A^ runs, on the other hand, can propagate also by 
inertia-gravity waves, which according to Table 3 travel as fast as 

cj(l) - 301*31 m/s - 14.9 Ax/12 hr. 

Error from the poor Vj^jj analysis in the A^^ runs therefore quickly 
contaminates the forecasts over land. At the next analysis time, the 
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Ajj runs then use the now large observedminus-fotf^teast residuals over 
land to produce a degraded analysis. This effect also explains a 
portion of the large Vgj^ errors for the runs. 

Notice also that the curves In Figs. 9e,f drop to zero while 
those In Figs. 9c»d do not: Initialization localizes errors and 

therefore "tightens” the correlation functions. The sante Is true of 
the cross-correlations v-^ and ^-v (not shown). Phillips (1981) points 
out that Initialization has the somevjhat paradoxical effect of forcing 
all grid variables to be changed by the Ivisertion of observational data 
at only one grid point. Our results show that there might be no need 
to worry: initialization changes the mean fields In such a way that the 
errors become more localized. 

Our experimental results make It clear that using improper 
correlation functions near boundaries separating data-dense and 
data-sparse regions can lead to unduly large errors near those 
boundaries, and that Initialization is a partial cure for this boundary 
effect . Another way to compensate for this effect Is to use forecast 
error growth rates which depend on data density. We saw already in 
Chapter 6 that for an optimal filter, growth rates over data-sparse 
regions are naturally much smaller than growth rates over data-dense 
regions (Figs. 5,6), due to advectlon of information. 

Accordingly, we conducted two more series of experiments, and 

Bg . Runs Ap identical to run Aj except that the growth rate 
Bdy is used over land and yd^ is used over the ocean. In the same way, 
runs Bp are similar to run Bj. We let y vary from 0 to 1, while we 

let B vary from I to 2, both in Increments of ^ , 

The runs Ag and Bp ^y all gave remarkably better results than the 
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Ag and runs* Basalts of the best runs, A 5 / 4 ^ 1/4 ^ are 

suanarlzed in Table 5. The analysis Is greatly laproved In both 
runs; run A 2 ^q also gives a substantial improvement for ug];, and vgi^ • 
Tie results of both runs compare favorably with those of the run. 

We conclude that use of proper forecast error variance growth rates is 
essential to the performance of 01. 

The tuning procedure we have described for determining proper 
growth rates is a manual one: analysis error variances for different 
runs were examined a posteriori , after which it was decided which 
growth rates produce the best results. For the purposes of operational 
NWP, a more systematic way of tuning the growth rates would be to do so 
adaptively; adaptive estimation was discussed » and references given, in 
Sec. 2.6. Correlation functions, especially those near boundaries 
separating data-dense and data-*sparse regions, could also be determined 
adaptively. Still, it might be better to approximate Eq. (2.22b) in a 
more direct fashion than is currently done in 01. and to do so 


adaptively 
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APPBNDIX 

PROOFS OF RESULTS IN CHAPTER 5 


Before proving Che lemmai of Section 5.2, ye revley eome 
elemencary notlone concerning iubepaces. The epan of a •et of vectore 
(X|,X 2 , , denoted hy ».»Xp>, la the set of all linear 

combinations of the vectors. Let S f (0) be a subspace of R”. There 
exists a set of linearly Independent vectors tX|,X2» . * • »5s^ 


S • <xi.X2,...,Xg> ; 


any such set Is called a basis for S, and every such set consists of 
the same lumber of vectors. That number Is called the dimension of S, 
dim S - s > 0; dim S 5 0 if S - (0) . 

If V Is a symmetric nxn matrix which Is positive definite on S, 
x'^Vx > 0 for all nonzero x e S, then there exists a V-orthonormal basis 
for S, l.e., a basis (Zi,;f2* * consisting of vectors which satisfy 




If {xi »X 2 , . . . ,Xg) Is any basis for S, then the Gram-Schmldt process . 


where 


2l - 3Sl / 4lVxi» 

^3 " Sj / 4 j * 3“2,...,s, 


Wj - - I * 

1»1 


(A. la) 
(A. lb) 


(A.lc) 
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prodicea • V-ortltonotiwl basis tXl»X2>»’* *XaJ 
basis l8 referred to simply as an orthonormal basis . 

If is a subspace of S» we denote by 
complement of Sj In S, l.e* » t!ie set of all vectors 
orthogonal to every vector In S^; 


An I-orthonormal 

®i the orthogonal 
In S which are 


" (x E S* ■ 0 for all * c Sj} . 


(A.2) 


It Is a fact that S-| Is a subspace of S and that 

H • ®1 + S-j[ - S ; (A.3ti,b) 

by the sum of two subspaces, + S2 > we mean the set of vectors of 
the form + 22 with zj^ e Sj, % S2. Since 

dim (S^ + S2) ■ dim Sj + dim S2 “ dim (Sj S2) 

for any two subspaces Sj^,S2, Eqs. (A. 3) imply tliat if dim Sj ■ q _> 0 

then dim Sj ■ s-q. A sura of complementary subspaces is said to be 
direct ! Eqs. (A.3a,b) together imply that in fact every vector z c S 
can be uniquely expressed in the form z - zj + 22 where 21 ® ^1 and 
22 e s\ . 

For furtlier reference see, for example, Nerlng (1970, especially 
Secs. 1.3, 1.4, 4.4, 5.1, 5.4). 


A.l. Proof of Lemma 1 

We give a constructive proof. The construction will be used in 
the proof of Lemma 4, and serves as a model for the proof of Theorem 2. 


» & 


« 1 - 
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If S ■ {0) • then n ■ 0 is a V-orthogoMl projection onto S for all 

'/ 

V« We therefore assume S {0), so that dim S ■ s > 0. 

Let Sj ■ S n Ker V, with dim Sj ■ q» and let S-^ be the orthogonal 
conplement of Sj In S, defined by Eq. (A. 2); dim s\ ■ s-q. For now we 
assume q > 0 and s-q > 0; we return later to the case in which q ■ 0 or 
8-q ■ 0. We show that the required projection can be expressed as (the 
sum of two projections, one onto S| and one onto S'^. 

First we show that 

> 0 for all nonzero e S'j^. (A. 4) 

We have 

si n Ker V - (S n S^) n Ker V 

- sj n (s n Ker v) 

- s| n Sj - {0}, 


so 0 for all nonzero e Sp Factoring V as V = , we 

therefore have jt Q, so that 2—^ “ ^ from which 

(A. 4) follows since V is positive semldef Inite* 

It follows from (A,4) that there exists a V-orthonormal basis for 

si. 


®i * ^X.I»Y2» • • • »Zs-q^» (A. 5a) 

Zi’^Zj “ «ij * i.j ■ 1, . ..,8-q. (A.5b) 

We now define 

"V * I * 

i-1 


(A.6) 
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and show tl»t Hy la a V-ottl»gonftl projection onto ${• 
Range Wy C s\. If it e R" the^ 


"vS " I <3Ll'^5)Xl ® 

i-1 


by Eqa. (A. 6, A. 5a). 

Range Ry ^ S^. Mb want to ahow tlat If t ® S’j[» there la a 

vector X c R*' such that Hyx - t- ^ 5 " t ® vector, 

because from Eq. (A.5a) we have 

8-q 

“JXJ 

J-1 

for some scalars , whence Eqs. (A. 6, A.5b) give 

8-q 8-q 

"vx I I “jxixrxj 
1-1 J-l 

S-q 8-q 

- I I «jXi«lj - X- 
1-1 J-l 

n j - Ry. From Eqs. (A.6, A.5b) we Imve 

n5 • T XDtlv T ijll'' 

1-1 J-l 

S-q S-q 

« I Xi I ^IjXj'^ " "v 
1-1 J-l 

(VRy)"^ - VRy. From Eq. (A.6) we lave 

(v'^Ry)’' - ( I v’’^XiXiV)'^ 

i«i 


: SS:.. *■ 


'i 
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- T . v\ , 

Iwl 

and the reeult f ollows since ■ V. 

Having shown that Ily is a V-orthogonal projection onto S-^, we now 
let orthonormal basis for S| , 


Si 

T 

SiSj 


^£l »S2> * • * *5.q^* 

fi i j * " !•»••• *9» 


and we define 



(A.7a) 

(A.7b) 


(A.8) 


From the proof that Ily Is a V-orthogonal projection onto S-^, It Is 
clear that IT^ Is an (I-) orthogonal projection onto Sj. 

Defining further 

n - Hy + Hi , (A.9) 


we show that II Is a V-orthogonal projection onto S. We will make use of 
the facts that 


xjzj - 0 , 1 ■ l,..,,8-q, j 


1 * • • »»9* 


and 

^-1 


(A.lOa) 


(A. 10b) 


the former equality follows from the definition of Sj, Eq . (A.2), 

while the latter follows from the fact that Sj » S ^ Ker V is a subset 


of Ker V 


OfMOlNAL PAGE 13 
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Range II C s « If x c R" then 

Hx «■ IlyX + Ilix E s| + Sj , 

since Ryx e Sj and Rjx e Sj , and the result follows since Sj + Sj ■ S, 
Eq. (A.3b). 

Range II ^ S . Suppose w e S. We show that Ilw ■ w. According to 

Eqi (A.3b), there Is a vector e S'jji and a vector z e Sj such that w ■ 

J^+z. Now Ilyz ■ 0 by Eqs. (A, 6, A. 7a, A. 10b), and njjr - 0 by Eqs. 

(Ar5a, A. 8, A. 10a). Therefore 

nw - oiv + niXz + s) “ "vX • 

But Hy^ » ^ according to the proof that Range jHy ^s|; Bimllarly, IIjZ ■ 
z. Therefore Ilw « y + z ■ w . 

0*0 ^ m 0 0 ^ 

- n. Suppose X E R". and let w ■ llx. Then w e S. since 
■ ^ • 000 ^ 000 0 ^ 000 ^ 

Range II CS, and Ilw ■ w, according to the proof that Range II ^ S* 

Therefore 

n^x » nw ■ w « nx , 

000 000 000 000 . -0 

l.e. , n^x « nx for all X e r”, and therefore n^ = R. 

( Vn )^ * Vn . Prom Eqs. (A. 8, A. 10b) , it follows that VH j ■ 0. 
Therefore Vn “ VRy , and the result follows since we already showed 
that (VRy)'^ - Wy. 

This concludes the proof in case q > 0 and s-q > 0. In summary, 
starting from arbitrary bases for S|^ and S-^, one could use the 


-151- 

Gram-Schmidt process (A*l) to construct the baaes In Eqs. (A*5| A.7), 

and then construct the projection H by means of Eqs. (A. 6, A. 8, A.9). 

In case q ■ 0, l.e. » ■ (O) , Eq* (A.3b) implies that ■ S and 

dim S-^ >■ s-q " s > 0, so that Ily In Eq . (A. 6) Is a V-orthogonal 

projection onto S. Similarly , if s-q ■ 0, then II j in Eq. (A. 8) is a 
V-orthogonal projection onto S» for all V. 

A. 2. Proof of Lemma 2 

♦ 

Sufficiency . Let n and A be two V-orthogonal projections onto S, 
and let A ■ II - A . We show that if S n Ker V ■ (0) , then A ■ 0. 

If xe R"» then Axe S, so Eq. (5.2) implies that IIAx ■Ax. 
Since X is arbitrary, we have 


HA - A, 


and similarly, 


An - n. 


Therefore, 

A - n - A ■ An - A ■ A (n - i) , 

and 

VA - VA(n - I). 

We also have 

VA - (VA)'^ - a'^v - (nA)’^V 

- A%% - A'^(vn)'^ - A'^^vn , 

so that 
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VA - ATvn(n - I) - A'^v(n2 - n> - o » 

and therefore Ax e Ker V for every x e R”* But Ax - nx - Ax e S fot 

every x e R*» since llx e S and Ax e S and S is a subspace. Therefore 
Ax e S n Ker V • {0} . That is, Ax ■ 0 for all x c R”, whence A ■ 0. 

Necessity* Suppose that S n Ker V ^ {0} and let II be a 
V-orthogonal projection onto S. We construct a matrix A ^ H which is 
also a V-orthogonal projection onto S. 

Let V e R” be any nonzero vector such that v'^w «• 0 for all w c S, 

i.e. » V jt 0 and v e S'^, the orthogonal complement of S in R-. There 

are many such vectors v: letting dim S ■ s, Eqs. (A. 3) imply that 
dim » n-s > 0, since S is a proper subspace of R”. 

Let z e S n Ker V with z 0. We claim that 

A - n + zv*^ 

is a V-orthogonal projection onto S. Obviously A II, since both v and 
£ are nonzero* 

Range A C S * If x e R", then 

^5 “ Bx 4- (v'^x)z e S , 

since IIx e S, z e S, and $ is a subspace* 

Range A ^ S * Let w e S, so that IIw * w* We have also v"^w » 0, 

since V e S'*"* Therefore 
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l.e. , Ax ■ w has solution x - w for all w e S. 

^ 0t0. M*' 

A^ * A« As In the proof that 11^ H In the proof of Lemma 1, 
Range A C S and Aw ■ w for all w e S together imply that A^ ™ A» 

(VA)**- *■ VA» Vte have 

VA - vn + (vz)v’’^ - vn 

since z e Ker V, and the result follows since (VII ■ Vn. 

A.3. Proof of Lemma 3 

Ue show tliat the general solution of problem (5.5) is given by 

X - + z , (A. 11a) 

where H is any V-orthogonal projection onto S (such projections exist 
by Lemma 1), and where z is any vector such that 

z c S n Ker V. (A. 11b) 

The lemma follows immediately from this result, for if S H Ker V « (0) 
then z ■ 0 and, according to Lemma 2, H is unique, while if sn Ker V ^ 
{0} then neither z nor II is unique. 

Let n be a fixed V-orthogonal projection onto S, and define y ■ 
Y(x) by 

Y(x) - (x - x)'^V(x - jr) . 

We have 
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If (5) ■ Kx-flx) + (nx-x) Fvi (x-nx) + (nx-x)l 

■ (x-nx)^v(x-nx) + (nx-x)^v(nx-x) + 2(x-nx)^v(nx-x)» 

The last term vanishes If x c S, for then jc ■ IIx by Bq. (5.2), whence 

(5-nx)^v(nx-x) ■ (nx-nx)^v(nx-x) 

- (x-x)%'^v(n-i)x - 0 ; 

the last equality follows from Eqs. (5.1b, 5.3) and the fact that V Is 
symmetric, 

n'^v(n-i) - (vn)’^(n-i) - vn(n-i) 

p V(il2 - n) - 0 . (A.12) 

Defining z * z(x) by 

5 * X - IlX » (A. 13) 

we therefore have 

Y(x) - z'^^Vz + g if X e S, 

where 

5 - (nx - x)’^v(nx - x) 

is independent of x. Now z*^Vz 0 since V is positive scmideflnite, so 
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Thereforc x i« o iolutlon of problem (5.5) if end only If 

Y(x) ■ 5 end x c S, 
and this Is the ceiie If and only If 

■ 0 and £ e S ; (A. 14) 

tnat the stateinents x e S and £ c S are equivalent follows from (A. 13) 
and the fact that S is a subspace. 

The first condition in (A. 14) is equivalent to z c Ker V, for if 
z'^Vz n 0 then, factoring V as V ■ vJVj , we have (Vi£)'*'(Vjz) - 0, 
whence Vjz ■ 0 and Vz ■ VjV^z « Oj obvipuBly Vz ■ 0 implies z'^Vz ■ 0. 
Conditions (A.l4) are therefore equivalent to (A.llb), so x is a 
solution of problem (5.5) if and only if it is of the form (A.ll), 
where II is any V-orthogonal projection onto S. 

A.4e Proof of Lemma 4 

Let s ■ dim S, and let G be an nxs matrix whose s columns are an 
orthonorroal basis for S, 

^ SiHj - «ij* 

The (1,J)*’*' element of the matrix C^C is wjwj , and therefore 

C'^C - I . (A.lSa) 

We have also 

CCT . I 
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whlch, according to the proof of l«nna I, Is an orthogonal projection 
onto S. Since there Is only one such projection, we therefore have 

III " (A. 1 5b) 

Now define 

Gy - 

From Eqs. (5.7), we have 

(V-1/2)Tvv-1/2 . 

and therefore 

CyVCy - (V"^/2c)'^V(V"^/2c) 

* G^C 

- I ; 

that is, the columns of Cy are a V-orthonormal basis for S. 

According to the proof of Lemma 1, and analogously with Eqs. 
(A. 15), the matrix CyCyV Is therefore a V-orthogonal projection onto S. 
There is only one such projection since V is positive definite, so 

Ily ■ GyGyV. 

But then 

Hy - c) (V-1/2 c)T V 

« y-1/2 Q qT (y-l/2)T V 

. V-1/2 (y-l/2)T y 

- V^/2 yl/2 ^ 


since (V“^/2)T y = yl/2. 
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A (St Proof of theorem 1 

The proof Is similar to that of Lemma 3. The uniqueness part of 
the theorem follows Immediately from Lemma 2. If A satisfies Eq# 
(5.21), then Lj^ - 0 according to Eq. (5.20), and Lemma 2 implies 
uniqueness of the A-orthogonal projection, ■ n ; % In Eq. (5.18) 
is therefore uniquely determined. On the other liand, if A does not 
satisfy Eq . (5.21), then Ilj^ and Lj^ , and therefore Kj^ and , are not 

uniquely determined. 

It therefore remains only to verify tliat the general solution of 
problem (5.17) Is given by Eq. (5.18). We omit subscripts In the 
remainder of the proof; an observation time k is atBuiaed. 

Let n be a fixed A-orthogonal projection matrix onto R ; that such 
a matrix exists Is a consequence of Lemma 1. The constraint (5.17b) 
states that each column of K roust lie In R, so It follows from Eq. 
(5.2) that the constraint Is equivalent to requiring 

nK - K. (A. 1 6) 

Indicating tlie dependence of n upon K by writing r) ■ n(k), we find a 
simple fornula for n(IlK) from which, with Eq. (A,16), the general 
solution (5.18) will follow. 

Fr om Eq . (2.16), we have 

n(nK) - trace [A(nK - 1^)C(11K - + AZ] , (A. 17) 

where, according to Eqs. (2.14b, 2.15b, 2.20a), 

G - HP^h'^ + R , 
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Z - pf - P*h'*^C"^P* , 
kKB ■ P^h’^C'^ 


We now show that Eq. (A.17) can be written 

n<nK) - trace A[n(K-#®)]c[n<K-#’*)]'^ 

+ trace Al(I-n)K*®l Cl(l-n)KKBjT + trace AZ (A. 18) 

We expand the first term inEq. (A.17) as. 

A<nK - #") C(nK - #®)T 

- Ain(K-K’^®)-(i-n)#®} cin(K-K’®)-(i-n)K'^®]’^ 

- AlnCK-K*^®)] Gln(K-KK®)]T + A((I-n)KKB] ci(i-n)K’®l'*^ 

- A(I-n)KK®C(K-#®)V - An(K-K*®)C<#®)'J^(I-n)'^. (A.19) 

Now, 

trace A(I-n)#®C(K-K’^)%'*^ - trace n%(I-n)K^C<K-K‘®)'^ 

- 0 . (A.20a) 

The first equality in (A. 20a) follows from Eq. (2. lie), while the 
second follows from the fact that 

n'^Ad - n) - 0 ; 

cf. Eq. (A.12). Similarly, we have 


0 . 


trace An(K - k!^) C(K*®)’^^(I - n)'^ 


(A.20b) 
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Insettlng Eq« (A.19) into Eq. (h,l7)t and using Eqs* (A.20, 2.11b), 
yields Eq. (A.18). 

The last two terms in Eq. (A.18) are independent of K. 

^lalogously to the argument following Eqs. (2.18, 2.19), it follows 
that n(nK) is minimised with respect to K if and only if 

AjnXK - - 0, (A.21a) 

where A has been factored as A - A|Aj. If Eq. (A.21a) holds, then 

An(K “ #®) - 0 i (A.2lb) 

On the other hand, If (A. 2 lb) holds, tlien the first term on the 
right-hand side of Eq. (A.18) vanishes and n(HK) is minimized: 
conditions (A.2 la, A.2 lb) are equivalent. Therefore, a matrix K is a 
solution of problem (5.17) if and only if K satisfies Eqs. (A. 16, 
A.21b), i.e., iff 

nK - K , 

An K - An#® . 

We seek solutions of Eqs. (A.2 2) of the form 

K • JIK*^ + L , (A.23) 

where L is yet to be determined. Substituting Eq. (A.23) into Eqs. 
(A.22), and using Eq. (5.19b), we find that 


(A.22a) 
(A.2 2b) 
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HL - L » (A.24a) 

AHL » 0 . (A.2Ab) 

Equation (A.24a) statas that each column of L lies In R » and Eq. 
(A*24b) states that each column of IlL ■ L lies In the kernel of A; that 
Is, Eqs. (A.24) are equivalent to 

Range L c R n Ker A. (A»25) 

Equations (A.23, A, 25), with II being any A-orthogonal projection onto 
R , therefore represent the general solution of problem (5.17). 

A. 6. Proof of LemiTB 5 

If A Is real then the submatrices Aj are real, and the fpcmula 

M/2 

A(m) - I A. 

j— M/2+1 

immediately implies that the matrices A(a)) satisfy Eqs. (5.33a, b). 
Conversely, If the matrices A(u)) satisfy Eqs. (5.33a,b), then the 
Inversion fornula 

A. - i T e-2iTiju»/M ^(a,) 

” (i)— M/2+1 

Immediately implies that the Aj are real, and hence that A is real. 

If A Is symmetric and real, i.e., A ■ A^ ■ A*, then we have from 


A - FAF* that 
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X* - FA*F* - FAF* - A , 


which, with Eqi (5.30), Implies Eq. (5.33c). Conversely, if Eqs. 
(5.33a,b,c) are satisfied, then A*^ « A^, as we have just shown, and 
A* ■ A ; f rom A ■ f*Xf we therefore have 

A*^ - A* - F*A*F - F*AF - A, 


l.e. , A is symmetric. 

For positive semidefiniteness. It Is clear from Eqs. (5.30, 5.32) 
that the following statements are equivalent; 


x'^Ax 2 0, 
x'^F*AFx > 0 , 
(Fx)*A(Fx) > 0 , 

z*h > 0 . 

X*(u))A(aj)j7(w) 2 0, 


for all X c r”, 
for all X e R”, 
for all X e R”, 

for all jr of the form y ■ Fx, x e R”, 
for all complex 3-vectors y(w) , 


and for all oi • — "FI , . .. , ^ . 


That the conditions in the last two statements are equivalent follows 
from the fact that If * Fx and x g R” Is arbitrary, then ;^(oj) is, for 
each 0 ), an arbitrary complex 3-vector; conversely, if the last 
statement is true, then in particular it is true for arbitrary vectors 
satisfying J|;(-u)) - y^(w) end x(M/2) » ^(M/2), in which case x - f\ is 
an arbitrary real n’-veGtor. 


A *7. Proof of Lemm 6 



(5.39a) ■> (5,39b) . Suppose there Is an w, say w* , §uch that 

A(w*) ro(w*) - 0 ; 

we show tlwt this Implies RH Ker A j* {0}. Suppose for now that 
w* fWll. Then from Eqs. (5.33a, 5.35a) we have also 

A("W*) ro(-w*) - 0 , 

A 

Def ine the n-yector w ■ Fw by 

w(±w*) * ro(±o)*) , 
w(o)) ■ 0 if w ± ui* . 

Then w i 0 and, according toEq. (5.36), we have we R . Clearly 
A((o) w(w) ■ 0 for allu), and therefore 

0 - Aw - (FAF*) (Fw) - FAw , 

<dience Aw ■ 0 since F Is nonsingular. Therefore w is a nonzero vector 
in Rn Ker A. If w* ■ M/2, the same result obtains by setting 

w(M/2) - ro(M/2) , 
w(to) “ 


0 if w 1* M/2. 
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(5.39b) ■> (5.39c) » Suppose 

A(w) rQ<w) 0 

for all u>. Since A*(w) “ A(w) by Eq. (5.33c), A(u)) has a 

factorization 


A(m) - A*(u) A2 ((ij), 


Since A(w) £ 0 ( 0 ) ) A 0, we have Aj(w) rQ(u) A 0, so 


rQ(ui )A(ai )rQ(w ) * lAj(<»)) rQ(u)]* [Ai(w) ro(w)] f 0, 


which, according toEq. (5.33d), implies that 

£0(64) A(u») rQ((o) > 0 . 

(5.39c) »> ( 5»39a) . Suppose that R n Ker A {0} ; we show that 
there is an w for which (5.39c) does not hold. lat 


w e R Ke r A 


with w 0. Since w e R , Bq . (5.36) implies that there is an u, say 

w* , su<vh. that w(o>*) » p rQ((a*) with p f 0. Since Aw » 0, we have 

Aw » (FAF*)(Fw) -FAw - 0. 


Therefore 
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^ w(w*) - 0 , 


whence 

Eo^“*) A(w*) Co((i>*) ■ 0 . 


A. 8 . Proof of Theorem 2 

That there exists a unique A-orthogonal projection onto R follows 
from Lemma 2, so we need only verify that 11 defined by Bqs. (5.41) is 
Indeed an A-orthogonal projection. This II Is certainly block clrculant 
since It Is defined by II ■ F*IIF, with II block-dl/agonal. 

Range K C R . Let w e r” and let * ■ IIw ; we want to show that 
z e R . We have 


z ■ Rz * RHw ■ (FRF*) (Fw) “ Rw , 
so 

z(w) “n(w) w((o) “ * 

where, according to Eq . (5.41c), 

0o(“) “ A(w) w((i))l. 

Now Is real, according to Eqs. (5.39c, 5.41d), so from Eqs. 

(5. 33a, b, 5. 35, 5. 37) It follows that 

* *** * 0»1»***» M/2 — 1 
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60 (?) - 6o(?)- 

Accotding to Eq. (5.36), therefore, we have z e R . 

Range II 3 R . Let w e R ; we show that IIx ■ w has solution x ■ 
w. Since we R, Bq. (5.36) Implies that 

w(u) - 

for some scalars PQ(aj), whence we have froraEqs. (5.41c,d) that 

n(ai) w(o)) • Eo^“^ 

“ Eo^“^ * w(uj). 

^ A A 

Therefore Ilw * w and 

nw = (F*nF)(F’*w) « F*nw » f*w « w. 

n ^ * n . It follows immediately from Eqs. (5.41c,d) that 

[n(aj)]2 - n(aj) , 

so E" * n and 

n2 » (F*nF)(F*nF) - f^E^f - f*Ef n. 

( AII)’^ » AH. Prora Eq. (5.41c) we have 

[A(a} )n (w)]* <= [aj^A(u)ro(u)ro(w)A({i3 )] * 


ORIGINAL PAGE IS 

OF POOR QUALITY 


-166- 

)to(« )to(“ ) ■ A(w )n (w ) ; 

the second equality follows since is real and, according toEq. 
(5.33c), X*(w) ■ X(u). Therefore we have 

(An)* - An . 

It follows from Eqs. (5. 33a, b, 5.35, 5. Ale) that 

n (— w ) — n (w ) , u5 — 0, 1 , . . . , m / 2 “1 
n(^) - n(|), 

so n is real; since A is also real we therefore have 

(An)'^ - (An)*. 

Then 

(An)'f - (An)* - (F*AFF*nF)* - (F*AnF)* 

- F*(An)*F - F*AnF - (F*AF)(F*nF) 

- An . 
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(A) 

°0 

c-i 

Cl 

1 

7.51 

-255.77 

308.26 

2 

14.14 

-182.95 

228.81 

3 

16.89 

-166.88 

209.99 

A 

18.12 

-161.00 

202.87 

5 

18.76 

-158.22 

199.46 

6 

19.12 

-156.70 

197.58 

7 

19.35 

-155.78 

196.43 

8 

19.50 

-155.18 

195.68 


TABLE 1 . Phase speeds of solutions of the continuous system, Eqs. 
(3. I, 3. 3), in meters per second. The phase speeds are given by Eq. 
(3.22a), and are presented here for the first eight wave numbers. The 
speeds Cq are the Rossby wave phase speeds, while c_]^ and c^ are, 
respectively, the phase speeds or westward-propagating and 

eastward-propagating inertia-gravity waves. 



7.53 


-255.99 


308.45 



TABLE 2. As in Table 1, but using the approximate formulas (3.25) 







7.44 

-248.92 

301.31 

2 

13,12 

-164.92 

208.45 

3 

13.73 

-133.39 

17;i .52 

4 

11.98 

-107.91 

141.65 

5 

9.17 

- 82.00 

110.85 

6 

5,95 

- 54.49 

76.45 

7 

2.79 

- 26.22 

38.13 

8 

0.00 

0.00 

0.00 


TABLE 3 . As In Table 1, but for the discrete system* Eqs. (4.7). The 
phase speeds are given by Eq. (4.34), and are obtained by first 
solving the eigenvalue problem, Eq, (4.27), and then solving Eq. 
(4.28) for the elgenfrequencies. 
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RUN 

“ " '1 

“SL 

■^SL 

VRA 

VLN 

♦ SL 

♦ha 

♦ln 

%B 

,031 

.070 

.311 

.214 

.045 

.210 

.117 

Aq 

.127 

.302 

.390 

.334 

.220 

.297 

.313 

h/z 

.144 

.174 

.371 

.358 

.070 

.261 

.194 

Al 

.153 

.163 

.385 

.412 

.059 

.262 

.180 

h/z 

.167 

.165 

.398 

.457 

.062 

.272 

.201 

CM 

< 

.184 

.172 

.410 

.493 

.065 

.285 

.235 

®KB 

.074 

.092 

.317 

.223 

.064 

.217 

.127 

®0 

.075 

.265 

.371 

.303 

.183 

.283 

.286 

® l /2 

.074 

.113 

.364 

.313 

.068 

.263 

.198 

Bl 

.074 

.100 

.381 

.363 

.053 

.267 

.173 

® 3/2 

.074 

.099 

.394 

.394 

.051 

.275 

.170 

»2 

.074 

.100 

.404 

.412 

.051 

.. 

.284 

.174 


TABLE 4 . Summary of rms analysis errors at 10 dayst for runs Aj^ , 
\ * ®KB » ®a » for o - 0, 1/2, 1, 3/2, 2. 
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■jjRQ 

“SL 

VSL 

VHA 


♦ SL 

♦ha 

♦ln 


,031 

i070 

.311 

.214 

.045 

.210 

.117 

^2,0 

.073 

.096 

.374 

.282 

.055 

.259 

.166 

% 

.074 

.092 

.317 

.223 

.064 

.217 

.127 

®5/A,l/4 

.074 

.098 

.369 

.287 

-.052 

.256 

.165 


TABLE 5 » Sumraary of rras analysis errors at 10 days, for runs Aj^g , 
^2,0 » ®KB ♦ ®5/4,l/A • 








FIGURE CAPTIONS 


Figure 1* An Illustration of neteorologlcal observations available 
at or near 1200 CMT, January 9 » 1979. The various observing systems, 
as well as the error structures of data they provide, are described in 
Section 1.2. This figure is reproduced from the preface of Bengtsson 
et al . (1981), by permission of the publisher. 

Figure 2. A schematic representation of three projections onto the 
slow-wave subspace R . As discussed in Section 5.5, the projections 
do not coincide because the fast-wave subspace G, is not orthogonal to 
the slow-wave subs pace. The points labeled II|X , Hj^x , and n^x are, 
respectively, the parallel project ion, the orthogonal projection, and 
the miniraura-eneigy projection of a point x onto the slow-wave subspace. 

Fi^re 3. Time history of the state estimates w|^ at three 
locations, for the experiment using the standard KB filter. The 
selected locations are labeled SF (for San Francisco, x ■ -7Ax), NY 
(for New York, x ■ OAx), and HA (for Hawaii, x ■ 5 Ax). Figure 3a 

shows the u-conponent of velocity. Figure 3b shows the v-coraponent of 
velocity, and Figure 3c shows the geopc^>intial Notice the slow 

waves with a period of approximately 6 days, upon which are 
superimposed smaller-amplitude fast waves. 

Figure 4. Same as Figure 3, but for the experiment with the 
modified KB filter. The fast waves have been completely eliminated and 
the state estimates evolve slowly. 

Fl^re 5. Components of the expected rras estimation error, for the 
experiment with the standard KB filter. Figure 5a shows the expected 
rms error over land, Figure 5b owr the ocean, and Figure 5c over the 
entire domain. The Individual curves are labeled U, V, P and E, for 


-17> 


thci Mpecttd riM error In u, v» 4 total enargyi averaged over 

the Indicated region* The error at aynqptic tiaes decreaeee 

loinedlately over land| and more gradually over the ocean* In between 
aynoptlc tlmBe* the error over land Increaaea more aharply than the 
error over the ocean* due to advectlon of error from over the 
data-sparae ocean* Each curve converge! rapidly to a periodic 

function* 

Figure 6* Same as Figure 5, but for the experiment using the 
modified KB filter. Estimation errors In this case are nearly 

t 

Identical to those resulting from use of the standard KB filter, except 
that the u-conponent errors now Increase slightly with time* This Is 
due to the fact that, since the slow-wave subspace is quaslgeostrophlc, 
the modified filter allows almost no observational correction to be 
performed on the u-epraponents * 

Figure 7, Influence functions of selected observation stations at 
10 days, or k * 480 time steps, for the experiment with the standard KB 

filter. The influence functions of an observation station are obtained 

from columns of Kj^; they show the weight give to an observation of u, 
V or 4 at that station when updating points throughout the domain* 
Grid points are indicated by tick marks on the horizontal axis; the 
horizontal parallel lines and vertical dashed lines Indicate the 
observed regton, or land. The selected observation stations are SL 
(for Saint Louis, x ■ -3Ax), SF and NY (see Figure 3). 'fhe nine 
panels, 7a-7l, give the Influence of (a) u observations on u 
corrections, (b) v on u, (c) 4 on u» (<*) n on V, (e) v on v, (f) 4 on 
V, (g) u on 4 , (h) V on 4 , (1) 4 on 4 * Particularities of the curves 
are discussed in Sec* 6*2* 


0 
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Figufe 8. Same as Figure 7, but at the flrat aynoptlc time» k ^ 24 
time ateps. The Influence functions hiere are such more symmetric than 
those In Figure 7. Conparlson of these two figures allows one to 
distinguish between the effect of inhomogeneous data density and the 
effect of advectlon of Information, as discussed In the text. 

Figure 9. Plots of and vv forecast error correlation 

functions. Figures 9a, b show the correlation functions and 

, respectively , which are prescribed In 01, cf* Eqs. 
(7.10a,b). These two plots are generated by evaluating at the 

grid points x^, j “ -8,-7, .. *,7,8 ; Is homogeneous, or 

Independent of the base point x^ , Figures 9c, d show the true and 
\rv forecast error correlation functions, respectively, computed from 
at 10 days, for run Aj . These correlation functions are drawn for 
the base points SL (x^ ■ -3 Ax)» HA (x^ ■ 5 Ax) and LN (x^ • 8 Ax). 
Figures 9e,f show the same correlation functions as Figures 9c, d, but 
for the Initialized run , Comparison of the prescribed and true 
correlation functions helps explain the results of the 01 experiments 
which are summarized in Table 4. 
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Air»p« 
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Buoys 

Colbi\i 

Drops 

Pilots 
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Bbips 

•ysops 
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High quality wind obssrvstlons tm» slrcrsft 

Surfses prsBsure obssrvstlons fros drlltlng buoys 

Constant Isvsl balloons 

Rsdloaondcfl droppsd fros aircraft 

Wind AsavursBsnts fros ascsndlng balloons 

TbBpsraturs MsasursBsnts froa polar orbiting satsllitss 

Cloud drift wind asasursMsnts frcs gsostatlonSJT^r s^tsUltss 

iurfacs obssryatlons frois sbips 

Burf acs obssrvations irovjt land 
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Figure 1 
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